Python: identity and hashing in sets of objects
Question
How are __hash__ and __eq__ used to identify objects in sets? For example, here is some code that should help solve a domino puzzle:
class foo(object):
    def __init__(self, one, two):
        self.one = one
        self.two = two

    def __eq__(self, other):
        # Equal if the two numbers match in either orientation.
        if (self.one == other.one) and (self.two == other.two): return True
        if (self.two == other.one) and (self.one == other.two): return True
        return False

    def __hash__(self):
        # The hash depends only on the sum of the two numbers.
        return hash(self.one + self.two)

s = set()
for i in range(7):
    for j in range(7):
        s.add(foo(i, j))
len(s)  # returns 28. Why?
If I define only __eq__(), len(s) equals 49. That is fine, because as I understand it the objects (1-2 and 2-1, for example) are not the same object, even though they represent the same domino. So I added a hash function.
Now it works the way I want, but there is one thing I do not understand: the hashes of 1-3 and 2-2 should be the same, so they should be counted as the same object and the second one should not be added to the set. But it is added! I'm stuck.
Answer
Equality for dict/set purposes depends on equality as defined by __eq__. However, objects that compare equal are required to have the same hash value, and that is why you need __hash__. See this question for some similar discussion.
The hash itself does not determine whether two objects count as the same in dictionaries or sets. The hash is like a "shortcut" that only works one way: if two objects have different hashes, they are definitely not equal; but if they have the same hash, they still might not be equal.
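To illustrate the one-way nature of that shortcut, here is a minimal sketch. The `Key` class is hypothetical (it is not from the question), and its hash is deliberately constant so that every instance collides:

```python
class Key:
    """Hypothetical key type whose hash is deliberately the same for all instances."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Key):
            return NotImplemented
        return self.value == other.value

    def __hash__(self):
        # Every Key collides; the set must fall back on __eq__ to tell them apart.
        return 0


s = {Key(1), Key(2), Key(1)}
distinct_count = len(s)          # 2: equal hashes alone did not merge Key(1) and Key(2)
found = Key(2) in s              # True: same hash and __eq__ says equal
missing = Key(3) in s            # False: same hash, but __eq__ says not equal
```

A constant hash is legal but slow, since every lookup degenerates into a chain of __eq__ calls; it is used here only to make the collision obvious.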
In your example, you defined __hash__ and __eq__ to do different things. The hash depends only on the sum of the numbers on the domino, but equality depends on both individual numbers. This is legal, since it is still the case that equal dominoes have equal hashes. However, as I said above, it does not mean that equal-sum dominoes will be considered equal. Some unequal dominoes, such as 1-3 and 2-2, will still have equal hashes. But equality is still determined by __eq__, and __eq__ still looks at both numbers, so that is what determines whether two dominoes are equal.
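The 1-3 versus 2-2 case from the question can be checked directly. The sketch below reuses the question's logic under the assumed name `Domino` (the original class is called `foo`):

```python
class Domino:
    """Same __eq__ and __hash__ logic as the question's foo class."""

    def __init__(self, one, two):
        self.one = one
        self.two = two

    def __eq__(self, other):
        if (self.one == other.one) and (self.two == other.two):
            return True
        if (self.two == other.one) and (self.one == other.two):
            return True
        return False

    def __hash__(self):
        return hash(self.one + self.two)


a, b, c = Domino(1, 3), Domino(2, 2), Domino(3, 1)

same_hash = hash(a) == hash(b)   # True: both sums are 4
a_equals_b = a == b              # False: __eq__ compares the individual numbers
a_equals_c = a == c              # True: 1-3 and 3-1 are the same domino
unique = len({a, b, c})          # 2: only a and c are merged
```

So 1-3 and 2-2 land in the same hash bucket, but the set then calls __eq__, finds them unequal, and keeps both.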
It seems to me that the appropriate thing to do in your case is to define both __hash__ and __eq__ to depend on the ordered pair: that is, put the two numbers into a fixed order, compare the greater of the two first, then compare the lesser. This will mean that 2-1 and 1-2 will be considered the same.
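One possible way to write that suggestion is sketched below. The class name `NormalizedDomino` and the helper `_key` are assumptions made for illustration; they are not part of the question's code:

```python
class NormalizedDomino:
    """Sketch: both __eq__ and __hash__ derive from one normalized (greater, lesser) pair."""

    def __init__(self, one, two):
        self.one = one
        self.two = two

    def _key(self):
        # Hypothetical helper: the two numbers in a fixed order, greater first.
        return (max(self.one, self.two), min(self.one, self.two))

    def __eq__(self, other):
        if not isinstance(other, NormalizedDomino):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        # Hashing the same key that __eq__ compares keeps the two consistent.
        return hash(self._key())


dominoes = {NormalizedDomino(i, j) for i in range(7) for j in range(7)}
count = len(dominoes)            # 28 distinct dominoes in a double-six set
```

Because both methods use the same key, equal dominoes always hash equally, and dominoes with the same sum but different numbers no longer share a hash by construction.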