Understanding generators in Python

2022-01-30 00:00:00 python generator

Question

I am reading the Python cookbook at the moment and am currently looking at generators. I'm finding it hard to get my head around them.

As I come from a Java background, is there a Java equivalent? The book talks about "Producer / Consumer", but when I hear that I think of threading.

What is a generator and why would you use it? Without quoting any books, obviously (unless you can find a decent, simple answer directly from a book). Perhaps with examples, if you're feeling generous!


Answer

Note: this post assumes Python 3.x syntax.

A generator is simply a function which returns an object on which you can call next, such that every call returns some value, until it raises a StopIteration exception, signaling that all values have been generated. Such an object is called an iterator.

Normal functions return a single value using return, just like in Java. In Python, however, there is an alternative, called yield. Using yield anywhere in a function makes it a generator. Observe this code:

>>> def myGen(n):
...     yield n
...     yield n + 1
... 
>>> g = myGen(6)
>>> next(g)
6
>>> next(g)
7
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

As you can see, myGen(n) is a function which yields n and n + 1. Every call to next yields a single value, until all values have been yielded. for loops call next in the background, thus:

>>> for n in myGen(6):
...     print(n)
... 
6
7
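To make "for loops call next in the background" concrete, here is a rough sketch of what such a loop does by hand. The names my_gen and consume_manually are hypothetical and used only for illustration; a real for loop also calls iter() on its argument first, which for a generator simply returns the generator itself.

```python
def my_gen(n):
    """Same two-value generator as myGen above, under a hypothetical name."""
    yield n
    yield n + 1


def consume_manually(iterator):
    """Collect values roughly the way a for loop does: call next() until StopIteration."""
    values = []
    while True:
        try:
            value = next(iterator)
        except StopIteration:
            break  # the generator is exhausted; a for loop ends silently here
        values.append(value)
    return values


manual = consume_manually(my_gen(6))
looped = [n for n in my_gen(6)]

print(manual)  # [6, 7]
print(looped)  # [6, 7]
```

The for loop catches StopIteration for you, which is why the traceback from the earlier session never shows up when you iterate with for.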

Likewise there are generator expressions, which provide a means to succinctly describe certain common types of generators:

>>> g = (n for n in range(3, 5))
>>> next(g)
3
>>> next(g)
4
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Note that generator expressions are much like list comprehensions:

>>> lc = [n for n in range(3, 5)]
>>> lc
[3, 4]
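The two forms look alike but behave differently once consumed. The following is a small sketch of that difference; the variable names are made up for the example. A list comprehension builds every value immediately and can be reused, while a generator expression produces values on demand and is exhausted after one pass.

```python
squares_list = [n * n for n in range(3, 6)]  # built eagerly: all values exist now
squares_gen = (n * n for n in range(3, 6))   # built lazily: nothing computed yet

first_pass = list(squares_gen)   # consumes the generator
second_pass = list(squares_gen)  # already exhausted, so this is empty

print(first_pass)    # [9, 16, 25]
print(second_pass)   # []
print(squares_list)  # [9, 16, 25], and the list can be read again and again
```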

Observe that a generator object is created once, but its code is not run all at once. Only calls to next actually execute (part of) the code. Execution of the code in a generator stops once a yield statement has been reached, upon which it returns a value. The next call to next then causes execution to continue in the state in which the generator was left after the last yield. This is a fundamental difference from regular functions: those always start execution at the "top" and discard their state upon returning a value.
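One way to watch this pausing and resuming is to record what the generator body has executed so far. The sketch below does that with a plain list; events and traced_gen are hypothetical names introduced only for this illustration.

```python
events = []  # hypothetical log, used only to record the order of execution


def traced_gen():
    events.append("started")
    yield 1
    events.append("resumed after first yield")
    yield 2
    events.append("resumed after second yield")


g = traced_gen()
snapshot_at_creation = list(events)   # creating the generator runs none of its body

first = next(g)                       # runs up to the first yield, then pauses
snapshot_after_first = list(events)

second = next(g)                      # continues from where it paused
snapshot_after_second = list(events)

print(snapshot_at_creation)           # []
print(first, snapshot_after_first)    # 1 ['started']
print(second, snapshot_after_second)  # 2 ['started', 'resumed after first yield']
```

Note that the last append never runs in this session: it would only execute on a third call to next, which would then raise StopIteration.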

There is more to be said about this subject. It is, for example, possible to send data back into a generator. But that is something I suggest you do not look into until you understand the basic concept of a generator.

Now you may ask: why use generators? There are a couple of good reasons:

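  • Certain concepts can be described much more succinctly using generators.
  • Instead of creating a function which returns a list of values, one can write a generator which generates the values on the fly. This means that no list needs to be constructed, so the resulting code is more memory efficient. In this way one can even describe data streams which would simply be too large to fit in memory.
  • Generators allow for a natural way to describe infinite streams. Consider for example the Fibonacci numbers: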

>>> def fib():
...     a, b = 0, 1
...     while True:
...         yield a
...         a, b = b, a + b
... 
>>> import itertools
>>> list(itertools.islice(fib(), 10))
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

This code uses itertools.islice to take a finite number of elements from an infinite stream. You are advised to have a good look at the functions in the itertools module, as they are essential tools for writing advanced generators with great ease.
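As a rough illustration of the memory argument and of combining generators with itertools, the sketch below compares a list-returning function with its generator counterpart. The function names squares_as_list and squares_as_generator are hypothetical, chosen only for this example.

```python
import itertools


def squares_as_list(limit):
    """Eager version: builds the whole list before returning it."""
    result = []
    for n in range(limit):
        result.append(n * n)
    return result


def squares_as_generator(limit):
    """Lazy version: produces one value at a time and keeps no list."""
    for n in range(limit):
        yield n * n


# Both give the same total, but the generator never stores all the values.
total_eager = sum(squares_as_list(1000))
total_lazy = sum(squares_as_generator(1000))

# itertools.count() is itself an infinite stream; itertools.takewhile
# cuts a stream off at the first value that fails the test.
small_squares = list(
    itertools.takewhile(lambda x: x < 50, (n * n for n in itertools.count()))
)

print(total_eager == total_lazy)  # True
print(small_squares)              # [0, 1, 4, 9, 16, 25, 36, 49]
```

For 1000 values the difference in memory is negligible, but the lazy version works unchanged when the limit is far too large for a list to fit in memory.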

   About Python 2: in the above examples next is a function which calls the method __next__ on the given object. In Python 2 that method is called next, so older code uses o.next() instead of next(o). The built-in next() is available from Python 2.6 onward and calls .next for you, so the following is only needed in Python 2.5 and earlier:

>>> g = (n for n in range(3, 5))
>>> g.next()
3
