Python - class __hash__ method and set
Question
I'm using set() and the __hash__ method of a Python class to prevent objects with the same hash from being added to a set more than once. According to the Python data model documentation, set() treats objects with the same hash as the same object and adds them only once.
But it behaves as follows:
class MyClass(object):
    def __hash__(self):
        return 0

result = set()
result.add(MyClass())
result.add(MyClass())
print(len(result))  # len = 2
With string values, however, it works as expected:
result = set()  # start from an empty set again
result.add('aida')
result.add('aida')
print(len(result))  # len = 1
My question is: why are objects with the same hash not treated as the same object in a set?
Answer
Your reading is incorrect. The __eq__ method is used for equality checks. The documentation only states that the __hash__ value must also be the same for two objects a and b for which a == b (i.e. a.__eq__(b)) is true.
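The way a set uses the two methods can be sketched as follows. This assumes CPython's hash-table-based set; the `Traced` class and the `calls` list are hypothetical names used only for illustration. The sketch logs which special methods run while two distinct objects with the same hash are added.

```python
calls = []  # hypothetical log of the special methods the set invokes


class Traced:
    """Illustrative class: constant hash, identity-based equality, both logged."""

    def __hash__(self):
        calls.append("__hash__")
        return 0

    def __eq__(self, other):
        calls.append("__eq__")
        return self is other  # same rule as the default object.__eq__


result = set()
result.add(Traced())
result.add(Traced())

print(calls)        # "__hash__" is recorded before any "__eq__"
print(len(result))  # 2
```

The hash only narrows down where to look; when the hashes collide, the set falls back on __eq__ to decide whether the new object is already present.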
This is a common logic mistake: a == b being true implies that hash(a) == hash(b) is also true. However, an implication is not an equivalence: the converse, that hash(a) == hash(b) would mean a == b, does not follow.
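As a small illustration of the one-way implication, here is a sketch with a hypothetical `ConstantHash` class (not part of the question): every instance has the same hash, but equality depends on a `value` attribute, so unequal objects with equal hashes remain separate set members.

```python
class ConstantHash:
    """Hypothetical class: every instance hashes to 0, equality compares value."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 0

    def __eq__(self, other):
        return isinstance(other, ConstantHash) and self.value == other.value


a, b, c = ConstantHash(1), ConstantHash(2), ConstantHash(1)

same_hash = hash(a) == hash(b) == hash(c)
distinct = {a, b, c}

print(same_hash)      # True: all three hashes are 0
print(a == b)         # False: equal hashes do not imply equality
print(len(distinct))  # 2: a and c are equal, b is kept separately
```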
To make all instances of MyClass compare equal to each other, you need to give the class an __eq__ method; otherwise Python compares their identities instead. This might do:
class MyClass(object):
    def __hash__(self):
        return 0

    def __eq__(self, other):
        # another object is equal to self, iff
        # it is an instance of MyClass
        return isinstance(other, MyClass)
Now:
>>> result = set()
>>> result.add(MyClass())
>>> result.add(MyClass())
>>> len(result)
1
In practice you would base __hash__ on the same properties of your object that __eq__ uses for comparison, for example:
class Person:
    def __init__(self, name, ssn):
        self.name = name
        self.ssn = ssn

    def __eq__(self, other):
        return isinstance(other, Person) and self.ssn == other.ssn

    def __hash__(self):
        # use the hashcode of self.ssn since that is used
        # for equality checks as well
        return hash(self.ssn)

p = Person('Foo Bar', 123456789)
q = Person('Fake Name', 123456789)
print(len({p, q}))  # 1
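When equality depends on more than one attribute, one common approach is to hash a tuple of exactly those attributes. The `Point` class below is a hypothetical example, not taken from the question; it is a minimal sketch of the same idea as Person, extended to two fields.

```python
class Point:
    """Hypothetical example class; equality uses both coordinates."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # hash the same tuple that __eq__ compares, so equal
        # points are guaranteed to have equal hashes
        return hash((self.x, self.y))


points = {Point(1, 2), Point(1, 2), Point(2, 1)}
print(len(points))  # 2
```

Objects stored in a set should not have these attributes changed afterwards, since the set would then look for them under a stale hash.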