numpy.unique gives wrong output for a list of sets
Problem description
I have a list of sets given by:
sets1 = [{1},{2},{1}]
When I find the unique elements in this list using NumPy's unique, I get:
np.unique(sets1)
Out[18]: array([{1}, {2}, {1}], dtype=object)
As can be seen, the result is wrong, because {1} is repeated in the output.
When I change the order of the input so that equal elements are adjacent, this doesn't happen.
sets2 = [{1},{1},{2}]
np.unique(sets2)
Out[21]: array([{1}, {2}], dtype=object)
Why does this occur? Or is there something wrong with the way I have done this?
Solution
What happens here is that np.unique is built on the internal _unique1d helper in NumPy's source, which sorts the array with the .sort() method and then keeps each element that differs from the one before it. Duplicates are therefore removed only if the sort places them next to each other.
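To make the mechanism concrete, here is a minimal pure-Python sketch of a sort-based unique: sort, then drop every element equal to its predecessor. It is a simplified illustration, not NumPy's actual code, and the function name unique_by_sort is hypothetical.

```python
def unique_by_sort(items):
    """Simplified sort-based unique: sort, then keep each element
    that differs from the one immediately before it."""
    ordered = sorted(items)  # relies only on the elements' "<" operator
    result = []
    for item in ordered:
        if not result or item != result[-1]:
            result.append(item)
    return result


# With integers the sort groups duplicates together, so they are removed.
ints_result = unique_by_sort([3, 1, 3, 2])

# With sets the sort cannot bring the two {1} entries together,
# so the non-adjacent duplicate survives.
scattered = unique_by_sort([{1}, {2}, {1}])

# When the duplicates are already adjacent, they are removed.
adjacent = unique_by_sort([{1}, {1}, {2}])

print(ints_result, scattered, adjacent)
```

The sketch reproduces the behaviour seen in the question: the outcome depends on whether equal sets happen to be neighbours after sorting.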
Sorting a list of sets, however, does not order the sets by the integers they contain. Python's < operator on sets tests for a proper subset, so neither {1} < {2} nor {2} < {1} is true, and the sort leaves the elements where they are. So we get (and that is not what we want):
sets = [{1},{2},{1}]
sets.sort()
print(sets)
# prints [{1}, {2}, {1}]
# i.e. the list has not been "sorted" the way we want it to be
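The reason the sort has no effect is that comparison operators on Python sets mean subset tests rather than an ordering by value. A short check, using hypothetical variable names, illustrates this:

```python
a, b = {1}, {2}

a_less_b = a < b               # is {1} a proper subset of {2}?
b_less_a = b < a               # is {2} a proper subset of {1}?
subset_case = {1} < {1, 2}     # a genuine proper subset

print(a_less_b, b_less_a, subset_case)  # False False True
```

Since neither disjoint set is "less than" the other, a comparison-based sort has no information with which to reorder them.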
As you have pointed out, if the list of sets is already ordered so that equal sets are adjacent, np.unique will work (since the list is then effectively sorted beforehand).
One specific solution (please be aware that it only works for a list of sets that each contain a single integer) is to sort the list by that integer before calling np.unique:
np.unique(sorted(sets, key=lambda x: next(iter(x))))
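If the sets may contain more than one element, one possible alternative is to skip sorting altogether and deduplicate by hashing. The sketch below assumes that a plain Python list (rather than a NumPy array) is an acceptable result and that first-occurrence order should be kept; unique_sets is a hypothetical helper name, not part of NumPy.

```python
def unique_sets(sets):
    """Return the distinct sets in first-occurrence order.

    Sets are unhashable, so each one is converted to a frozenset,
    deduplicated through dict keys (which preserve insertion order),
    and converted back to a regular set.
    """
    distinct = dict.fromkeys(frozenset(s) for s in sets)
    return [set(fs) for fs in distinct]


single = unique_sets([{1}, {2}, {1}])
multi = unique_sets([{1, 2}, {3}, {2, 1}])
print(single, multi)
```

This does not depend on the order of the input, because equality is decided by hashing instead of by comparing neighbours after a sort.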