What is wrong with `std::set`?

2022-01-07 00:00:00 string set c++ stl remove-if

In another topic I was trying to solve this problem: removing duplicate characters from a std::string.

std::string s= "saaangeetha";

Since the order was not important, I sorted s first, then used std::unique, and finally resized the string to get the desired result:

aeghnst

That is correct!
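A minimal sketch of the sort-then-unique approach described above. The helper name sorted_unique_chars is made up for illustration, and erase is used here in place of the resize mentioned in the text; both trim the leftover tail that std::unique leaves behind.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper (not from the original post): returns the distinct
// characters of `s` in sorted order.
std::string sorted_unique_chars(std::string s) {
    // Sorting places equal characters next to each other.
    std::sort(s.begin(), s.end());
    // std::unique compacts each run of equal neighbours to one element and
    // returns the new logical end; erase drops everything after it.
    s.erase(std::unique(s.begin(), s.end()), s.end());
    return s;
}
```

Calling sorted_unique_chars("saaangeetha") returns "aeghnst", which matches the output shown above.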

Now I want to do the same, but keep the original order of the characters intact. That is, I want this output:

sangeth

So I wrote this:

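#include <algorithm>
#include <iostream>
#include <set>
#include <string>

template<typename T>
struct is_repeated
{
    std::set<T> unique;  // characters seen so far, stored by value
    // Returns true if c was already in the set (insert fails for a repeat).
    bool operator()(T c) { return !unique.insert(c).second; }
};

int main() {
    std::string s = "saaangeetha";
    s.erase(std::remove_if(s.begin(), s.end(), is_repeated<char>()), s.end());
    std::cout << s;
}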

This gives the following output:

saangeth

That is, a is still repeated, though the other repetitions are gone. What is wrong with the code?

Anyway, I changed my code a bit (see the comments):

#include <algorithm>
#include <iostream>
#include <set>
#include <string>

template<typename T>
struct is_repeated
{
    std::set<T> & unique;  //made reference!
    is_repeated(std::set<T> &s) : unique(s) {} //added line!
    bool operator()(T c) { return !unique.insert(c).second; }
};

int main() {
    std::string s = "saaangeetha";
    std::set<char> set; //added line!
    s.erase(std::remove_if(s.begin(), s.end(), is_repeated<char>(set)), s.end());
    std::cout << s;
}

Output:

sangeth

Problem solved!

So what is wrong with the first solution?

Also, if I don't make the member variable unique a reference type, the problem doesn't go away.

What is wrong with std::set or the is_repeated functor? Where exactly is the problem?

I also note that if the is_repeated functor is copied somewhere, then every member of it is copied as well. I don't see the problem here!

Recommended answer

In GCC (libstdc++), remove_if is essentially implemented as:

    template<typename It, typename Pred>
    It remove_if(It first, It last, Pred predicate) {
      first = std::find_if(first, last, predicate);
    //                                  ^^^^^^^^^ a copy of the predicate is passed here
      if (first == last)
         return first;
      else {
         It result = first;
         ++ first;  // skip the element that find_if flagged for removal
         for (; first != last; ++ first) {
           if (!predicate(*first)) {
    //          ^^^^^^^^^ the original predicate, whose set is still empty
              *result = std::move(*first);
              ++ result;
           }
         }
         return result;
      }
    }

Note that your predicate is passed by value to find_if, so the struct, and therefore the set, that is modified inside find_if is not propagated back to the caller.
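The by-value copy can be observed in isolation with a predicate that counts its own calls. The following is only an illustrative sketch: counting_is_a and calls_seen_by_caller are hypothetical names that do not appear in the original code.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical predicate that records how often it has been invoked.
struct counting_is_a {
    std::size_t calls = 0;
    bool operator()(char c) { ++calls; return c == 'a'; }
};

// Returns the call count stored in the caller's own predicate object
// after std::find_if has run.
std::size_t calls_seen_by_caller(const std::string& s) {
    counting_is_a pred;
    // std::find_if takes its predicate by value, so it works on a copy.
    auto it = std::find_if(s.begin(), s.end(), pred);
    static_cast<void>(it);  // the iterator itself is not needed here
    // The caller's pred was never invoked; only the copy was.
    return pred.calls;
}
```

For "saaangeetha" the function returns 0: every increment happened on the copy inside find_if, just as every insertion into the set did in the question's first version.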

Since the first duplicate appears at:

  saaangeetha
//  ^

The initial "sa" is kept after the find_if call. Meanwhile, the set in remove_if's own predicate is still empty (the insertions made inside find_if were local to its copy). The loop that follows therefore keeps the third a.

   sa | angeth
// ^^   ^^^^^^
// ||   kept by the loop in remove_if
// ||
// kept by find_if
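Two possible ways to avoid depending on how the algorithm copies its predicate are sketched below. Neither comes from the original answer, and the function names dedupe_with_ref and dedupe_with_loop are made up for illustration. The first wraps the question's functor in std::ref so that every copy refers to the same set; it assumes the implementation visits the elements front to back, which the common standard libraries do. The second sidesteps remove_if entirely with a plain loop.

```cpp
#include <algorithm>
#include <functional>
#include <set>
#include <string>

// Same functor as in the question: true when c has been seen before.
template <typename T>
struct is_repeated {
    std::set<T> unique;
    bool operator()(T c) { return !unique.insert(c).second; }
};

// Variant 1 (assumption: elements are visited in order): std::ref makes
// the algorithm copy a reference wrapper, so all copies share one set.
std::string dedupe_with_ref(std::string s) {
    is_repeated<char> pred;
    s.erase(std::remove_if(s.begin(), s.end(), std::ref(pred)), s.end());
    return s;
}

// Variant 2: explicit loop, no stateful predicate handed to an algorithm.
std::string dedupe_with_loop(const std::string& s) {
    std::set<char> seen;
    std::string out;
    for (char c : s) {
        // insert(...).second is true only the first time c is seen.
        if (seen.insert(c).second) {
            out.push_back(c);
        }
    }
    return out;
}
```

The loop variant returns "sangeth" for "saaangeetha" on any conforming implementation; the std::ref variant produces the same string with the usual front-to-back implementations of remove_if.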
