Does Python copy references to objects when slicing a list?

2022-01-20 00:00:00 python copy list slice memory

Question

When a list is sliced, are the references to its contents copied from the original list? I can imagine that this may not be necessary, but I read the opposite (mentioned in passing).

This question matters, for instance, for the following idiom when my_list is very long:

for (first_elmt, second_elmt) in itertools.izip(my_list[:-1], my_list[1:]):
    …

A copy would use up memory and, presumably, some time. I compared this idiom with looping over the index of first_elmt using xrange(), on a list of 100 million integers. The slicing approach is actually 20% faster, but it does seem to copy references (the system time is longer). Is this indeed the case?

PS: I now realize that it is quite natural for slices to copy the references: if the original list is modified, the slice does not change, so it is easier for the slice implementation to copy the references of the original list. A pointer to the CPython implementation would be interesting, though.


Answer

Slicing will copy the references. If you have a list of 100 million things:

l = [object() for i in xrange(100000000)]

and you take a slice:

l2 = l[:-1]

l2 will have its own backing array of 99,999,999 pointers, rather than sharing l's array. However, the objects those pointers refer to are not copied:

>>> l2[0] is l[0]
True
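
The same behaviour can be checked on a small list. The following is a minimal sketch, assuming Python 3 (where range replaces xrange); the names original and sliced are illustrative only. It checks that the slice shares its elements with the original list, that rebinding a slot in the original leaves the slice untouched, and that the slice carries a pointer array of its own:

```python
import sys

original = [object() for _ in range(1000)]
sliced = original[:-1]

# Same objects: the slice holds references, not copies of the elements.
same_objects = all(a is b for a, b in zip(original, sliced))

# Separate list: rebinding a slot in the original leaves the slice untouched.
first = original[0]
original[0] = None
slice_unchanged = sliced[0] is first

# The slice owns its own pointer array, so its reported size grows with its length.
slice_bytes = sys.getsizeof(sliced)
print(same_objects, slice_unchanged, slice_bytes)
```

The exact value of slice_bytes depends on the platform and interpreter build, so it is printed here rather than assumed.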

If you want to iterate over overlapping pairs of elements of a list without making a copy, you can zip the list with an iterator that has been advanced one position:

second_items = iter(l)
next(second_items, None) # Avoid exception on empty input
for thing1, thing2 in itertools.izip(l, second_items):
    whatever()

This takes advantage of the fact that zip stops when any input iterator stops. This can be extended to cases where you're already working with an iterator by using itertools.tee:

i1, i2 = itertools.tee(iterator)
next(i2, None)
for thing1, thing2 in itertools.izip(i1, i2):
    whatever()
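
To see the memory difference between the two approaches, here is a rough sketch, assuming Python 3 (where the built-in zip is lazy, like itertools.izip in Python 2). The helper names pairs_by_slicing, pairs_by_tee and peak_bytes are hypothetical, and tracemalloc only reports allocations made through Python's allocators, so the numbers are indicative rather than exact:

```python
import itertools
import tracemalloc

def pairs_by_slicing(items):
    # Builds two new lists of len(items) - 1 references each.
    return zip(items[:-1], items[1:])

def pairs_by_tee(iterable):
    # Works for any iterable; tee only buffers the items one iterator is ahead by.
    first, second = itertools.tee(iterable)
    next(second, None)  # advance by one; safe on empty input
    return zip(first, second)

def peak_bytes(make_pairs, items):
    """Peak bytes allocated (as seen by tracemalloc) while consuming the pairs."""
    tracemalloc.start()
    try:
        for _first, _second in make_pairs(items):
            pass
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak

items = list(range(200_000))
slicing_peak = peak_bytes(pairs_by_slicing, items)
tee_peak = peak_bytes(pairs_by_tee, items)
print(slicing_peak, tee_peak)
```

On Python 3.10 and later, itertools.pairwise yields the same overlapping pairs directly.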
