在 Python 中使用 enumerate 遍历列表时是否应该创建副本
问题描述
在回答这个问题时,我遇到了在 Python 中从未想过的事情(由用户指出).
When answering this question, I came across something I never thought about in Python (pointed by a user).
基本上,我已经知道(这里有一个有趣的线程)我必须在迭代时复制在 Python 中改变一个列表以避免奇怪的行为.
Basically, I already know (here's an interesting thread about it) that I have to make a copy when iterating while mutating a list in Python in order to avoid strange behaviors.
现在,我的问题是,使用 enumerate
是否可以解决这个问题?
Now, my question is, is using enumerate
overcoming that problem ?
test_list = [1,2,3,4]
for index,item in enumerate(test_list):
if item == 1:
test_list.pop(index)
这段代码会被认为是安全的还是我应该使用,
Would this code be considered safe or I should use,
for index,item in enumerate(test_list[:]):
解决方案
首先,让我们回答您的直接问题:
First, let’s answer your direct question:
enumerate
在这里没有任何帮助.它的工作方式就好像它拥有一个指向底层可迭代对象的迭代器(并且,至少在 CPython 中,这正是它所做的),所以任何对列表迭代器不合法或不安全的事情都是不合法或不安全的使用围绕该列表迭代器的枚举对象.
enumerate
doesn’t help anything here. It works as if it held an iterator to the underlying iterable (and, at least in CPython, that’s exactly what it does), so anything that wouldn’t be legal or safe to do with a list iterator isn’t legal or safe to do with an enumerate object wrapped around that list iterator.
您的原始用例——设置 test_list[index] = new_value
——在实践中是安全的——但我不确定它是否保证是安全的.
Your original use case—setting test_list[index] = new_value
—is safe in practice—but I’m not sure whether it’s guaranteed to be safe or not.
您的新用例——调用 test_list.pop(index)
——可能不安全.
Your new use case—calling test_list.pop(index)
—is probably not safe.
列表迭代器最明显的实现基本上只是对列表的引用和对该列表的索引.因此,如果您在当前位置或该位置的左侧插入或删除,您肯定会破坏迭代器.例如,如果你删除 lst[i]
,这会将所有内容从 i + 1
移动到最后一个位置,所以当你继续移动到 i +1
,你跳过了原来的第 i + 1
th 值,因为它现在是第 i
th.但是如果你在当前位置的右边插入或删除,那就没问题了.
The most obvious implementation of a list iterator is basically just a reference to the list and an index into that list. So, if you insert or delete at the current position, or to the left of that position, you’re definitely breaking the iterator. For example, if you delete lst[i]
, that shifts everything from i + 1
to the end up one position, so when you move on to i + 1
, you’re skipping over the original i + 1
th value, because it’s now the i
th. But if you insert or delete to the right of the current position, that’s not a problem.
由于 test_list.pop(index)
在当前位置或左侧删除,因此即使使用此实现也不安全.(当然,如果您已经仔细编写了算法,以便在命中后跳过该值无关紧要,即使这样也可以.但是更多的算法无法处理.)
Since test_list.pop(index)
deletes at or left of the current position, it's not safe even with this implementation. (Of course if you've carefully written your algorithm so that skipping over the value after a hit never matters, maybe even that's fine. But more algorithms won't handle that.)
可以想象,Python 实现可以存储一个原始指针,指向用于列表存储的数组中的当前位置.这意味着插入 anywhere 可能会破坏迭代器,因为插入会导致整个列表重新分配到新内存.如果实现有时会在缩小时重新分配列表,那么可以在任何地方删除.我不认为 Python 不允许执行所有这些操作,因此,如果您想偏执,在迭代时从不插入或删除可能会更安全.
It’s conceivable that a Python implementation could instead store a raw pointer to the current position in the array used for the list storage. Which would mean that inserting anywhere could break the iterator, because an insert can cause the whole list to get reallocated to new memory. And so could deleting anywhere, if the implementation sometimes reallocates lists on shrinking. I don't think the Python disallows implementations that do all of this, so if you want to be paranoid, it may be safer to just never insert or delete while iterating.
如果您只是替换现有值,很难想象在任何合理的实现下这会如何破坏迭代器.但是,据我所知,语言参考和 list
库参考1 实际上并没有对列表迭代器的实现做出任何承诺.2
If you’re just replacing an existing value, it’s hard to imagine how that could break the iterator under any reasonable implementation. But, as far as I'm aware, the language reference and list
library reference1 don't actually make any promises about the implementation of list iterators.2
因此,您是否关心我的实现安全"、每个迄今为止编写的实现都安全"、每个可能的(对我)实现的安全"还是由参考保证安全"取决于您".
So, it's up to you whether you care about "safe in my implementation", "safe in every implementation every written to date", "safe in every conceivable (to me) implementation", or "guaranteed safe by the reference".
我认为大多数人在迭代期间很乐意替换列表项,但要避免缩小或扩大列表.但是,肯定有生产代码至少删除了迭代器的右侧.
I think most people happily replace list items during iteration, but avoid shrinking or growing the list. However, there's definitely production code out there that at least deletes to the right of the iterator.
1.我相信 tutorial 只是在某处说在迭代时永远不要修改任何数据结构——但这就是教程.始终遵循该规则当然是安全的,但遵循不太严格的规则也可能是安全的.
1. I believe the tutorial just says somewhere to never modify any data structure while iterating over it—but that’s the tutorial. It’s certainly safe to always follow that rule, but it may also be safe to follow a less strict rule.
2.除非 key
函数或其他任何东西试图在 sort
,结果未定义.
2. Except that if the key
function or anything else tries to access the list in any way in the middle of a sort
, the result is undefined.
相关文章