Python 3.6.5: "is" and "==" for integers beyond the caching interval

2022-01-25 00:00:00 python reference object compare

Question

I want to preface this by saying that I know the difference between == and is: == compares values, while is compares object identity (references). I also know that CPython caches the integers from -5 to 256 (inclusive) at startup, so comparing them with is should work.

However, I have seen some strange behaviour.

>>> 2**7 is 2**7
True
>>> 2**10 is 2**10
False

This is to be expected: 2**7 is 128 and 2**10 is 1024; one is in the interval [-5, 256] and the other is not.

However...

>>> 10000000000000000000000000000000000000000 is 10000000000000000000000000000000000000000
True

Why does this return True? The value is obviously way above any caching interval, and 2**10 is 2**10 clearly showed that is does not work on integers above 256. So why does this happen?


Answer

CPython detects constant values in your code and re-uses them to save memory. These constants are stored on code objects, and can even be accessed from within Python:

>>> codeobj = compile('999 is 999', '<stdin>', 'exec')
>>> codeobj
<code object <module> at 0x7fec489ef420, file "<stdin>", line 1>
>>> codeobj.co_consts
(999, None)
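
The same deduplication can be checked from a script. The following is a minimal sketch, assuming CPython; the variable names a and b and the filename '<example>' are made up for illustration. It compiles two assignments that use the same literal and counts how many integer constants end up in co_consts:

```python
# Compile a small module that mentions the literal 999 twice.
code = compile('a = 999\nb = 999', '<example>', 'exec')

# Keep only the integer constants (co_consts may also contain None).
int_consts = [c for c in code.co_consts if isinstance(c, int)]

print(code.co_consts)
print(len(int_consts))  # prints 1: the literal is stored only once
```

Both assignments therefore load the one 999 object held by the code object.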

Both operands of your is refer to this very same 999 integer. We can confirm this by disassembling the code with the dis module:

>>> import dis
>>> dis.dis(codeobj)
  1           0 LOAD_CONST               0 (999)
              2 LOAD_CONST               0 (999)
              4 COMPARE_OP               8 (is)
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE

As you can see, the first two LOAD_CONST instructions both load the constant with index 0, which is the integer 999.

However, this only happens if the two numbers are compiled together, as part of the same code object. If you create each number in a separate code object, they will no longer be identical. In the interactive interpreter, each statement you enter is compiled on its own:

>>> x = 999
>>> x is 999
False
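
Outside the interactive interpreter, the same contrast can be reproduced by calling compile explicitly. This is a sketch under the assumption that CPython is used; the namespace dictionaries and the '<example>' filename are illustrative only. The first comparison relies on the shared-constant behaviour described above. The result of the second is an implementation detail, so it is printed without a claimed value:

```python
# Case 1: both literals are compiled together, so they share one constant.
shared_ns = {}
exec(compile('a = 999\nb = 999', '<example>', 'exec'), shared_ns)
same_unit = shared_ns['a'] is shared_ns['b']

# Case 2: each literal is compiled into its own code object.
separate_ns = {}
exec(compile('a = 999', '<example>', 'exec'), separate_ns)
exec(compile('b = 999', '<example>', 'exec'), separate_ns)
separate_units = separate_ns['a'] is separate_ns['b']

print(same_unit)       # True: one code object, one 999 object
print(separate_units)  # typically False on CPython, but not guaranteed
```

Either way the values compare equal with ==, which is the comparison to use for numbers.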
