Concatenating tuples using sum()

2022-01-09 00:00:00 python itertools tuples sum

Question

From this post I learned that you can concatenate tuples with sum():

>>> tuples = (('hello',), ('these', 'are'), ('my', 'tuples!'))
>>> sum(tuples, ())
('hello', 'these', 'are', 'my', 'tuples!')

This looks pretty nice. But why does it work? And is it optimal, or is there something in itertools that would be preferable to this construct?


Answer

The addition operator concatenates tuples in Python:

('a', 'b')+('c', 'd')
Out[34]: ('a', 'b', 'c', 'd')

From the docstring of sum:

Return the sum of a 'start' value (default: 0) plus an iterable of numbers

This means sum doesn't start with the first element of your iterable, but with an initial value passed through the start argument.

By default sum is used with numbers, so the default start value is 0. Summing an iterable of tuples therefore requires starting from an empty tuple. () is an empty tuple:

type(())
Out[36]: tuple

Hence the concatenation works: sum starts from () and adds each tuple to the running result with +.
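To make the role of the start value concrete, here is a minimal sketch of roughly what sum does. The function name sum_sketch is made up for illustration and is not how the built-in is actually implemented; it only mirrors the behaviour described above.

```python
def sum_sketch(iterable, start=0):
    """Rough pure-Python equivalent of the built-in sum(), for illustration only."""
    total = start
    for item in iterable:
        total = total + item  # for tuples, + builds a new concatenated tuple
    return total

tuples = (('hello',), ('these', 'are'), ('my', 'tuples!'))

# Starting from the empty tuple: () + ('hello',) + ('these', 'are') + ...
joined = sum_sketch(tuples, ())
print(joined)

# With the default start of 0, the first step is 0 + ('hello',), which fails.
try:
    sum(tuples)
    default_start_failed = False
except TypeError:
    default_start_failed = True
print(default_start_failed)
```

The TypeError in the second half is why the empty tuple has to be passed explicitly.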

As for performance, here is a comparison (with itertools imported as it):

%timeit sum(tuples, ())
The slowest run took 9.40 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 285 ns per loop


%timeit tuple(it.chain.from_iterable(tuples))
The slowest run took 5.00 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 625 ns per loop

Now with t2 of size 10000:

%timeit sum(t2, ())
10 loops, best of 3: 188 ms per loop

%timeit tuple(it.chain.from_iterable(t2))
1000 loops, best of 3: 526 µs per loop

So if your list of tuples is small, you needn't bother. If it's medium-sized or larger, you should use itertools.
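The gap grows with input size because every + in sum builds a brand-new tuple and copies everything accumulated so far, whereas chain.from_iterable walks the elements once. The comparison can be reproduced outside IPython with the timeit module, as sketched below. The contents of t2 are an assumption (the original does not show how it was built), and the helper names concat_with_sum and concat_with_chain are made up for this example; absolute timings will differ by machine.

```python
import itertools
import timeit

# Hypothetical input: 2000 one-element tuples (the original t2 is not shown).
t2 = tuple((i,) for i in range(2000))

def concat_with_sum(tuples):
    # Each addition copies the whole intermediate result into a new tuple.
    return sum(tuples, ())

def concat_with_chain(tuples):
    # Iterates over every element once and builds the final tuple at the end.
    return tuple(itertools.chain.from_iterable(tuples))

# Both approaches produce the same tuple.
same_result = concat_with_sum(t2) == concat_with_chain(t2)

sum_seconds = timeit.timeit(lambda: concat_with_sum(t2), number=20)
chain_seconds = timeit.timeit(lambda: concat_with_chain(t2), number=20)

print(f"same result: {same_result}")
print(f"sum:   {sum_seconds:.4f} s for 20 runs")
print(f"chain: {chain_seconds:.4f} s for 20 runs")
```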
