Python中切片的高效迭代

2022-01-24 00:00:00 python slice performance iteration

问题描述

Python 中的切片操作的迭代效率如何?如果切片是不可避免的,还有其他选择吗?

How efficient are iterations over slice operations in Python? And if a copy is inevitable with slices, is there an alternative?

我知道对列表的切片操作是 O(k),其中 k 是切片的大小.

I know that a slice operation over a list is O(k), where k is the size of the slice.

x[5 : 5+k]  # O(k) copy operation

但是,当迭代列表的一部分时,我发现最干净(也是最 Pythonic?)的方法(无需求助于索引)是:

However, when iterating over a part of a list, I find that the cleanest (and most Pythonic?) way to do this (without having to resort to indices) is to do:

for elem in x[5 : 5+k]:
  print elem

但是我的直觉是,这仍然会导致子列表的昂贵副本,而不是简单地迭代现有列表.

However my intuition is that this still results in an expensive copy of the sublist, rather than simply iterating over the existing list.


解决方案

您可以使用 itertools.islice 从列表中获取切片迭代器:

You can use itertools.islice to get a sliced iterator from the list:

例子:

>>> from itertools import islice
>>> lis = range(20)
>>> for x in islice(lis, 10, None, 1):
...     print x
...     
10
11
12
13
14
15
16
17
18
19

更新:

正如@user2357112 所指出的,islice 的性能取决于切片的起点和可迭代的大小,普通切片几乎在所有情况下都会很快,应该是首选.以下是更多时间比较:

Update:

As noted by @user2357112 the performance of islice depends on the start point of slice and the size of the iterable, normal slice is going to be fast in almost all cases and should be preferred. Here are some more timing comparisons:

对于 Huge lists islice 当切片的起点小于列表大小的一半时,islice 略快或等于普通切片,对于更大的索引,普通切片是明确的赢家.

For Huge lists islice is slightly faster or equal to normal slice when the slice's start point is less than half the size of list, for bigger indexes normal slice is the clear winner.

>>> def func(lis, n):
        it = iter(lis)
        for x in islice(it, n, None, 1):pass
...     
>>> def func1(lis, n):
        #it = iter(lis)
        for x in islice(lis, n, None, 1):pass
...     
>>> def func2(lis, n):
        for x in lis[n:]:pass
...     
>>> lis = range(10**6)

>>> n = 100
>>> %timeit func(lis, n)
10 loops, best of 3: 62.1 ms per loop
>>> %timeit func1(lis, n)
1 loops, best of 3: 60.8 ms per loop
>>> %timeit func2(lis, n)
1 loops, best of 3: 82.8 ms per loop

>>> n = 1000
>>> %timeit func(lis, n)
10 loops, best of 3: 64.4 ms per loop
>>> %timeit func1(lis, n)
1 loops, best of 3: 60.3 ms per loop
>>> %timeit func2(lis, n)
1 loops, best of 3: 85.8 ms per loop

>>> n = 10**4
>>> %timeit func(lis, n)
10 loops, best of 3: 61.4 ms per loop
>>> %timeit func1(lis, n)
10 loops, best of 3: 61 ms per loop
>>> %timeit func2(lis, n)
1 loops, best of 3: 80.8 ms per loop


>>> n = (10**6)/2
>>> %timeit func(lis, n)
10 loops, best of 3: 39.2 ms per loop
>>> %timeit func1(lis, n)
10 loops, best of 3: 39.6 ms per loop
>>> %timeit func2(lis, n)
10 loops, best of 3: 41.5 ms per loop

>>> n = (10**6)-1000
>>> %timeit func(lis, n)
100 loops, best of 3: 18.9 ms per loop
>>> %timeit func1(lis, n)
100 loops, best of 3: 18.8 ms per loop
>>> %timeit func2(lis, n)
10000 loops, best of 3: 50.9 us per loop    #clear winner for large index
>>> %timeit func1(lis, n)

对于小型列表,在几乎所有情况下,普通切片都比 islice 快.

For Small lists normal slice is faster than islice for almost all cases.

>>> lis = range(1000)
>>> n = 100
>>> %timeit func(lis, n)
10000 loops, best of 3: 60.7 us per loop
>>> %timeit func1(lis, n)
10000 loops, best of 3: 59.6 us per loop
>>> %timeit func2(lis, n)
10000 loops, best of 3: 59.9 us per loop

>>> n = 500
>>> %timeit func(lis, n)
10000 loops, best of 3: 38.4 us per loop
>>> %timeit func1(lis, n)
10000 loops, best of 3: 33.9 us per loop
>>> %timeit func2(lis, n)
10000 loops, best of 3: 26.6 us per loop

>>> n = 900
>>> %timeit func(lis, n)
10000 loops, best of 3: 20.1 us per loop
>>> %timeit func1(lis, n)
10000 loops, best of 3: 17.2 us per loop
>>> %timeit func2(lis, n)
10000 loops, best of 3: 11.3 us per loop

结论:

去普通切片.

相关文章