Unexpected behavior of Python set.__contains__
Problem description
Borrowing from the __contains__ documentation:
print set.__contains__.__doc__
x.__contains__(y) <==> y in x.
This seems to work fine for primitive objects such as int, basestring, etc. But for user-defined objects that define the __ne__ and __eq__ methods, I get unexpected behavior. Here is some sample code:
class CA(object):
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if self.name == other.name:
            return True
        return False

    def __ne__(self, other):
        return not self.__eq__(other)

obj1 = CA('hello')
obj2 = CA('hello')

theList = [obj1,]
theSet = set(theList)

# Test 1: list
print (obj2 in theList)  # prints True

# Test 2: set (weird)
print (obj2 in theSet)  # prints False, unexpected

# Test 3: iterating over the set
found = False
for x in theSet:
    if x == obj2:
        found = True
print (found)  # prints True

# Test 4: typecasting the set to a list
print (obj2 in list(theSet))  # prints True
So is this a bug or a feature?
Solution
For sets and dicts, you need to define __hash__. Any two objects that are equal should hash the same in order to get consistent / expected behavior in sets and dicts.
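To see why the set lookup in the question fails, here is a minimal sketch of the mismatch between == and hash(). It assumes CPython, where the default hash is derived from the object's identity. The explicit `__hash__ = object.__hash__` line is my addition, not part of the question's code: Python 2 keeps the default hash automatically, while Python 3 sets __hash__ to None (making the class unhashable) when a class defines __eq__ without __hash__.

```python
class CA(object):
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return self.name == other.name

    def __ne__(self, other):
        return not self.__eq__(other)

    # Assumption for illustration: keep the identity-based default hash.
    # Python 2 does this implicitly; Python 3 needs it spelled out.
    __hash__ = object.__hash__


obj1 = CA('hello')
obj2 = CA('hello')

equal = obj1 == obj2                  # __eq__ compares the names
same_hash = hash(obj1) == hash(obj2)  # default hashes follow identity
in_list = obj2 in [obj1]              # list membership only uses ==
in_set = obj2 in set([obj1])          # set membership compares hashes first

print(equal)      # True
print(same_hash)  # False
print(in_list)    # True
print(in_set)     # False
```

The set never gets as far as calling __eq__, because the two objects hash differently, which is why the list test and the set test disagree.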
I would recommend using a _key method, and then just referencing that anywhere you need the part of the item to compare, just as you call __eq__ from __ne__ instead of reimplementing it:
class CA(object):
    def __init__(self, name):
        self.name = name

    def _key(self):
        # The tuple that defines both equality and hashing.
        return type(self), self.name

    def __hash__(self):
        return hash(self._key())

    def __eq__(self, other):
        return self._key() == other._key()

    def __ne__(self, other):
        return not self.__eq__(other)
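As a quick check of the approach, the sketch below repeats the class so that it runs on its own and then retries the membership test from the question. The dict lookup and the `other` object are extra illustrations I added, not part of the original question.

```python
class CA(object):
    def __init__(self, name):
        self.name = name

    def _key(self):
        # The tuple that defines both equality and hashing.
        return type(self), self.name

    def __hash__(self):
        return hash(self._key())

    def __eq__(self, other):
        return self._key() == other._key()

    def __ne__(self, other):
        return not self.__eq__(other)


obj1 = CA('hello')
obj2 = CA('hello')
other = CA('world')  # hypothetical extra object with a different name

the_set = set([obj1])
in_set = obj2 in the_set                   # equal objects now hash the same
other_in_set = other in the_set            # different name, so not a member
value = {obj1: 'found'}[obj2]              # dict lookup by an equal key
unique = len(set([obj1, obj2, other]))     # obj1 and obj2 collapse into one

print(in_set)        # True
print(other_in_set)  # False
print(value)         # found
print(unique)        # 2
```

One caveat with this design: the attributes used by _key should not change while the object is stored in a set or used as a dict key, because the stored hash would no longer match.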