Understanding generators in Python
Problem description
I am reading the Python Cookbook at the moment and am currently looking at generators. I'm finding it hard to get my head round them.
As I come from a Java background, is there a Java equivalent? The book talks about 'Producer / Consumer', but when I hear that I think of threading.
What is a generator and why would you use it? Without quoting any books, obviously (unless you can find a decent, simple answer directly from a book). Perhaps with examples, if you're feeling generous!
Solution
Note: this post assumes Python 3.x syntax.†
A generator is simply a function which returns an object on which you can call next, such that every call returns some value, until it raises a StopIteration exception, signaling that all values have been generated. Such an object is called an iterator.
Normal functions return a single value using return, just like in Java. In Python, however, there is an alternative, called yield. Using yield anywhere in a function makes it a generator. Observe this code:
>>> def myGen(n):
...     yield n
...     yield n + 1
...
>>> g = myGen(6)
>>> next(g)
6
>>> next(g)
7
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
As you can see, myGen(n) is a function which yields n and n + 1. Every call to next yields a single value, until all values have been yielded. A for loop calls next in the background, thus:
>>> for n in myGen(6):
...     print(n)
...
6
7
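For readers coming from Java, it may help to see roughly what a generator saves you from writing. The following is a minimal sketch of a hand-written iterator class that behaves like myGen above, similar in spirit to implementing Java's Iterator interface. The names MyGenIterator and my_gen are made up for this illustration, and a real generator object is implemented differently inside Python.

```python
class MyGenIterator:
    """Hand-written iterator that mimics myGen(n): produces n, then n + 1."""

    def __init__(self, n):
        self._values = [n, n + 1]  # the values still to be produced
        self._index = 0            # position of the next value

    def __iter__(self):
        # An iterator returns itself from __iter__, so it works in a for loop.
        return self

    def __next__(self):
        # Called by next(); signals exhaustion by raising StopIteration.
        if self._index >= len(self._values):
            raise StopIteration
        value = self._values[self._index]
        self._index += 1
        return value


def my_gen(n):
    # The generator version: same behaviour, the state is kept automatically.
    yield n
    yield n + 1


from_class = list(MyGenIterator(6))
from_generator = list(my_gen(6))
print(from_class, from_generator)
```

Both versions can be used interchangeably in a for loop or with next; the generator simply needs far less bookkeeping.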
Likewise there are generator expressions, which provide a means to succinctly describe certain common types of generators:
>>> g = (n for n in range(3, 5))
>>> next(g)
3
>>> next(g)
4
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
Note that generator expressions are much like list comprehensions:
>>> lc = [n for n in range(3, 5)]
>>> lc
[3, 4]
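The practical difference between the two forms is when the values are computed. As a rough sketch (the variable names are made up for this illustration, and the exact sizes reported by sys.getsizeof depend on the Python build), the list comprehension stores every element up front, while the generator expression produces them on demand and can be consumed only once:

```python
import sys

# A list comprehension builds every element immediately.
squares_list = [n * n for n in range(100_000)]

# A generator expression only stores how to produce the next element.
squares_gen = (n * n for n in range(100_000))

# The list holds 100,000 references; the generator is a small object
# whose size does not depend on how many values it will produce.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)

# Both can be passed to functions that accept any iterable.
total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)
print(total_from_list == total_from_gen)

# A generator can be consumed only once: it is now exhausted.
leftover = list(squares_gen)
print(leftover)
```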
Observe that a generator object is created once, but its code is not run all at once. Only calls to next actually execute (part of) the code. Execution of the code in a generator stops once a yield statement has been reached, at which point it returns a value. The next call to next then causes execution to continue in the state in which the generator was left after the last yield. This is a fundamental difference from regular functions: those always start execution at the "top" and discard their state upon returning a value.
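One way to watch this pausing and resuming is to record when each part of the generator body runs. This is only an illustrative sketch; the function traced and its log parameter are invented for the example.

```python
def traced(log):
    # Each append records when a piece of the body actually runs.
    log.append("started")
    yield 1
    log.append("resumed after first yield")
    yield 2
    log.append("finished")


log = []
g = traced(log)                  # creates the generator object; no body code runs yet
log_after_creation = list(log)

first = next(g)                  # runs the body up to the first yield
log_after_first = list(log)

second = next(g)                 # resumes after the first yield, runs to the second
remaining = list(g)              # runs the rest of the body; nothing more is yielded

print(log_after_creation)
print(first, log_after_first)
print(second, remaining, log)
```

Creating the generator leaves the log empty, and each call to next adds exactly the entries between two yield statements.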
There is more to be said about this subject. For example, it is possible to send data back into a generator (reference). But I suggest you do not look into that until you understand the basic concept of a generator.
Now you may ask: why use generators? There are a couple of good reasons:
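- Certain concepts can be described much more succinctly using generators.
- Instead of creating a function which returns a list of values, one can write a generator which generates the values on the fly. This means that no list needs to be constructed, so the resulting code is more memory efficient. In this way one can even describe data streams which would be too large to fit in memory.
- Generators allow for a natural way to describe infinite streams. Consider for example the Fibonacci numbers: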
>>> def fib():
...     a, b = 0, 1
...     while True:
...         yield a
...         a, b = b, a + b
...
>>> import itertools
>>> list(itertools.islice(fib(), 10))
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
This code uses itertools.islice to take a finite number of elements from an infinite stream. You are advised to have a good look at the functions in the itertools module, as they are essential tools for writing advanced generators with great ease.
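As a small sketch of how itertools functions combine with generators, the following reuses the fib generator from above together with itertools.takewhile and a generator expression. The names below_100, even_fibs and first_five_even are chosen only for this illustration.

```python
import itertools


def fib():
    # Same infinite Fibonacci generator as above.
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# takewhile stops reading the infinite stream at the first value that fails the test.
below_100 = list(itertools.takewhile(lambda x: x < 100, fib()))

# A generator expression filters lazily; islice then takes the first five matches.
even_fibs = (x for x in fib() if x % 2 == 0)
first_five_even = list(itertools.islice(even_fibs, 5))

print(below_100)
print(first_five_even)
```

Because every stage is lazy, only as many Fibonacci numbers are computed as the final list calls need, even though fib itself never terminates.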
† About Python 2: in the above examples next is a function which calls the method __next__ on the given object. In Python 2 that method is named next instead, so in Python 2.5 and earlier one writes o.next() instead of next(o). Python 2.6 and 2.7 have a built-in next() that calls .next, so the following form is only needed in older versions:
>>> g = (n for n in range(3, 5))
>>> g.next()
3