How to make the elements of a vector unique? (Remove non-adjacent duplicates)

2021-12-21 00:00:00 unique vector c++ stl

I have a vector containing a few non-adjacent duplicates.

As a simple example, consider:

2 1 6 1 4 6 2 1 1

I am trying to make this vector unique by removing the non-adjacent duplicates while maintaining the order of the elements.

The result would be:

2 1 6 4 

The solutions I tried are:

  1. Inserting into a std::set, but this approach disturbs the order of the elements.
  2. Using the combination of std::sort and std::unique, but that has the same ordering problem.
  3. Manual duplicate elimination:

    Define a temporary vector TempVector.
    for (each element in the vector)
    {
        if (the element does not exist in TempVector)
        {
            add it to TempVector;
        }
    }
    swap the original vector with TempVector.
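
For reference, the manual elimination in item 3 could be written in C++ roughly as follows. This is only a sketch that assumes C++11 and element types comparable with `==`; the name `RemoveDuplicatesManual` is made up for illustration. Because every element triggers a linear search of the temporary vector, it needs O(n²) comparisons in the worst case.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper name; mirrors the pseudocode step by step.
template <class T>
void RemoveDuplicatesManual(std::vector<T>& vec)
{
    std::vector<T> temp;
    for (const T& element : vec)
    {
        // Linear search: keep the element only if it has not been seen yet.
        if (std::find(temp.begin(), temp.end(), element) == temp.end())
        {
            temp.push_back(element);
        }
    }
    // Replace the original contents with the de-duplicated copy.
    vec.swap(temp);
}
```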

My question is:

Is there any STL algorithm that can remove the non-adjacent duplicates from the vector? What is its complexity?

Recommended answer

It's possible to do this without a temporary set, with (possibly) some loss of performance:

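#include <algorithm> // std::remove

// Keeps the first occurrence of each value in [first, last), preserving order.
// Returns the new logical end; elements past it are left in an unspecified state.
template<class Iterator>
Iterator Unique(Iterator first, Iterator last)
{
    while (first != last)
    {
        Iterator next(first);
        // Remove every later copy of *first, shrinking the range.
        last = std::remove(++next, last, *first);
        first = next;
    }

    return last;
}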

Used as:

vec.erase( Unique( vec.begin(), vec.end() ), vec.end() );

For smaller data sets, the simplicity of the implementation and the lack of any extra allocation may outweigh this approach's theoretically higher complexity (O(n²) comparisons in the worst case, versus O(n log n) with an additional std::set). Measuring with representative input is the only way to be sure, though.
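
For comparison, a set-based variant might look like the sketch below. It is not part of the answer above: the name `UniqueWithSet` is hypothetical, and it assumes C++11, a hashable element type, and `std::unordered_set` as the auxiliary set (giving O(n) average time; `std::set` would also work, at O(n log n)). The cost is the extra allocation the in-place version avoids.

```cpp
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical helper: keeps the first occurrence of each value, in order.
template <class T>
void UniqueWithSet(std::vector<T>& vec)
{
    std::unordered_set<T> seen;
    auto out = vec.begin();
    for (auto it = vec.begin(); it != vec.end(); ++it)
    {
        // insert().second is true only the first time a value is seen.
        if (seen.insert(*it).second)
        {
            if (out != it)
            {
                *out = std::move(*it); // compact kept elements toward the front
            }
            ++out;
        }
    }
    // Drop the leftover tail behind the compacted elements.
    vec.erase(out, vec.end());
}
```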
