in operator, float("NaN") and np.nan
Problem description
I used to believe that the in operator in Python checks for the presence of an element in a collection using the equality check ==, so element in some_list is roughly equivalent to any(x == element for x in some_list). For example:
True in [1, 2, 3]
# True because True == 1
or
1 in [1., 2., 3.]
# also True because 1 == 1.
However, it is well known that NaN is not equal to itself, so I expected float("NaN") in [float("NaN")] to be False. And it is indeed False.
However, if we use numpy.nan instead of float("NaN"), the situation is quite different:
import numpy as np
np.nan in [np.nan, 1, 2]
# True
But np.nan == np.nan still gives False!
How is this possible? What is the difference between np.nan and float("NaN")? How does in deal with np.nan?
Solution
To check whether an item is in a list, Python tests for object identity first, and tests for equality only if the objects are different.1
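The identity-then-equality rule can be sketched in plain Python. This is only an approximation of what the built-in list does, not its actual implementation; the helper contains and the class NeverEqual are hypothetical names introduced here for illustration.

```python
def contains(container, element):
    """Approximate `element in container`: identity first, then equality."""
    return any(x is element or x == element for x in container)


class NeverEqual:
    """Hypothetical class whose instances compare unequal to everything,
    including themselves (similar in spirit to NaN)."""

    def __eq__(self, other):
        return False


obj = NeverEqual()

# The same object is found because the identity test succeeds,
# even though obj == obj is False.
builtin_same = obj in [obj]
sketch_same = contains([obj], obj)

# A different instance fails both the identity and the equality test.
builtin_other = NeverEqual() in [obj]
sketch_other = contains([obj], NeverEqual())

print(builtin_same, sketch_same, builtin_other, sketch_other)
```

The sketch agrees with the built-in operator in both cases, which is consistent with the check described above.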
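float("NaN") in [float("NaN")] is False because two different NaN objects are involved in the comparison. The test for identity therefore returns False, and the test for equality then also returns False, since NaN != NaN.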
np.nan in [np.nan, 1, 2], however, is True because the same NaN object is involved in the comparison. The test for object identity returns True, so Python immediately recognises the item as being in the list.
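The difference between "the same NaN object" and "two different NaN objects" can be reproduced without NumPy. The sketch below uses math.nan as a stand-in for np.nan, on the assumption that both are ordinary Python float objects stored once as a module attribute; the variable names are illustrative only.

```python
import math

# Each call to float("nan") builds a new float object.
fresh_a = float("nan")
fresh_b = float("nan")

distinct_objects = fresh_a is fresh_b        # different objects
different_in_list = fresh_a in [fresh_b]     # identity fails, equality fails
same_in_list = fresh_a in [fresh_a]          # identity succeeds

# math.nan is a single object looked up by name, so both mentions
# below refer to the very same float.
module_nan_in_list = math.nan in [math.nan, 1, 2]
module_nan_equal = math.nan == math.nan      # NaN never equals itself

print(distinct_objects, different_in_list, same_in_list)
print(module_nan_in_list, module_nan_equal)
```

Binding float("NaN") to a name and reusing that name gives the same membership result as np.nan, which suggests the behaviour comes from object identity and not from anything special in NumPy.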
The __contains__ method (invoked by in) of many of Python's other built-in container types, such as tuples and sets, is implemented using the same check.
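A short check of a few other built-in containers, as a sketch run against CPython; other implementations may behave differently, as the footnote notes. The variable names are illustrative.

```python
nan = float("nan")

# Same object: membership succeeds in each container type.
in_tuple = nan in (nan,)
in_set = nan in {nan}
in_dict_keys = nan in {nan: "value"}

# A freshly created NaN is a different object and is not equal
# to the stored one, so membership fails.
other_in_tuple = float("nan") in (nan,)
other_in_set = float("nan") in {nan}

print(in_tuple, in_set, in_dict_keys, other_in_tuple, other_in_set)
```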
1 At least this is true in CPython. Object identity here means that the objects are found at the same memory address: the contains method for lists is performed using PyObject_RichCompareBool, which quickly compares object pointers before a potentially more complicated object comparison. Other Python implementations may differ.