Modifying a list while iterating over it in Python

2022-01-24 00:00:00 python python-3.x iteration

Question

I know to iterate over a copy of my list when I want to modify the original. However, the only explanation I've ever received on what's wrong with modifying a list while iterating over it is that "it can lead to unexpected results."

Consider the following:

lst = ['a', 'b', 'c', 'd', 'e']
for x in lst:
    lst.remove(x)
print(lst)

Here is my attempt at explaining what actually happens when one modifies a list while iterating over it. Note that line 2 is equivalent to for i in range(len(lst)):, and that len(lst) decreases by 1 with every iteration.

len(lst) starts at 5.

When i = 0, we have lst[i] = 'a' being removed, so lst = ['b', 'c', 'd', 'e']. len(lst) decreases to 4.

When i = 1, we have lst[i] = 'c' being removed, so lst = ['b', 'd', 'e']. len(lst) decreases to 3.

When i = 2, we have lst[i] = 'e' being removed, so lst = ['b', 'd']. len(lst) decreases to 2.

This is where I thought an IndexError would be raised, since i = 2 is not in range(2). However, the program simply outputs ['b', 'd']. Is it because i has "caught up" with len(lst)? Also, is my reasoning sound so far?


Solution

The C implementation is in the listiter_next function in listobject.c and the pertinent lines are

if (it->it_index < PyList_GET_SIZE(seq)) {
    item = PyList_GET_ITEM(seq, it->it_index);
    ++it->it_index;
    Py_INCREF(item);
    return item;
}

it->it_seq = NULL;
Py_DECREF(seq);
return NULL;

The iterator returns an object if its index is still in range (it->it_index < PyList_GET_SIZE(seq)) and returns NULL otherwise, which signals that the iteration is finished. It doesn't matter if you are off by 1 or a million, it's not an error.
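The index check described above can be imitated in plain Python. This is only a rough sketch of the logic, not the actual CPython code: the function name trace_remove_while_iterating and the variable index (standing in for it->it_index) are made up for illustration, and it assumes the list holds distinct items so that remove() deletes the item just visited.

```python
def trace_remove_while_iterating(items):
    """Imitate the list iterator's bounds check while removing each visited item."""
    lst = list(items)
    index = 0  # hypothetical stand-in for it->it_index
    steps = []
    # Same test as it->it_index < PyList_GET_SIZE(seq): re-evaluated on every pass.
    while index < len(lst):
        item = lst[index]
        index += 1
        lst.remove(item)  # the loop body from the question; shrinks the list by one
        steps.append((index - 1, item, list(lst)))
    # Falling out of the loop corresponds to the iterator reporting exhaustion.
    return lst, steps


result, steps = trace_remove_while_iterating(['a', 'b', 'c', 'd', 'e'])
for i, item, remaining in steps:
    print(i, item, remaining)
print(result)  # ['b', 'd']
```

After the third removal the index is 3 and the length is 2, so the check simply fails and the loop ends. Nothing ever indexes past the end of the list, which is why no IndexError appears.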

The general reason for doing things this way is that iterators and iterables can be consumed in multiple places (consider a file object that is read inside a for loop). An outer loop shouldn't crash with an IndexError just because it has run out of things to do. It's not illegal or inherently "stupid" to change an object you are iterating over; you just need to know the consequences of your actions.
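As a small illustration of both points, the sketch below uses an in-memory list in place of a file (the records data and variable names are invented for the example). The first part consumes one iterator from two places, the for loop and a next() call in its body. The second part shrinks a list far below the iterator's current position.

```python
# Part 1: one iterator consumed both by the for loop and inside its body.
records = ["name: a", "value: 1", "name: b", "value: 2", "name: c"]
it = iter(records)
pairs = []
for line in it:
    # Pull the following item from the same iterator; the default of None
    # covers the case where the loop body reaches the end first.
    value = next(it, None)
    pairs.append((line, value))
print(pairs)

# Part 2: the list shrinks well below the iterator's position.
data = list(range(10))
data_it = iter(data)
next(data_it)
next(data_it)             # the iterator's index is now 2
data.clear()              # the list's length is now 0
leftover = list(data_it)  # iteration just ends; no IndexError
print(leftover)  # []
```

In the first part the outer loop ends quietly even though the body took the last pairing slot, and in the second part the iterator stops without complaint once its index is no longer below the list's length.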
