Java streams: lazy vs. fusion vs. short-circuiting
I'm trying to form a concise and coherent understanding of the application of lazy evaluation within the Java streams API.
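Here is what I currently understand:
- Elements are only consumed as they are needed, i.e. streams are lazy and intermediate operations are lazy; a filter, for example, only filters when it is required to.
- Intermediate operations may be fused together (if they are stateless).
- Short-circuiting operations do not need to process the entire stream.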
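The three points above can be observed in a small trace. The following is a minimal sketch, not part of the original question: the class name `LazyStreamDemo` and its two helper methods are made up for illustration, and only standard `java.util.stream` calls are used. It records the order in which each stage sees elements of a sequential stream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical demo class; the names are illustrative only.
class LazyStreamDemo {

    // Builds a pipeline but never attaches a terminal operation,
    // so the filter lambda is never invoked and the log stays empty.
    static List<String> withoutTerminalOperation() {
        List<String> log = new ArrayList<>();
        Stream<Integer> pending = Stream.of(1, 2, 3)
                .filter(x -> { log.add("filter " + x); return true; });
        return log;
    }

    // Runs a sequential pipeline with a short-circuiting terminal operation
    // and records the order in which the stages see each element.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Optional<Integer> first = Stream.of(1, 2, 3, 4, 5)
                .peek(x -> log.add("source " + x))
                .filter(x -> { log.add("filter " + x); return x % 2 == 0; })
                .map(x -> { log.add("map " + x); return x * 10; })
                .findFirst(); // stops as soon as one element reaches it
        log.add("result " + first.orElse(-1));
        return log;
    }

    public static void main(String[] args) {
        // Prints: source 1, filter 1, source 2, filter 2, map 2, result 20
        trace().forEach(System.out::println);
    }
}
```

Each element travels through all stages before the next one is pulled from the source, and elements 3, 4 and 5 are never touched because `findFirst` already has its answer.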
What I want to do is bring all these ideas together and make sure I'm not misrepresenting anything. I'm finding it tricky because whenever I read any literature on Java streams, it says they're lazy or use lazy evaluation, and then almost interchangeably starts talking about optimisations such as fusion and short-circuiting.
So would I be right in saying the following?
Fusion is how lazy evaluation has been implemented in the stream API, i.e. an element is consumed, and operations are fused together wherever possible. I'm thinking that if fusion didn't exist then surely we'd be back to eager evaluation, as the alternative would just be to process all elements for each intermediate operation before moving on to the next?
Short-circuiting is possible without fusion or lazy evaluation, but in the context of streams it is very much helped by the implementation of these two principles?
I'd appreciate any further insight and clarity on this.
Recommended answer
As for fusion, let's imagine we have a map operation:
.map(x -> x.squash())
It's stateless and it just transforms any input according to the specified algorithm (in our case, it squashes them). Now the filter operation:
.filter(x -> x.getColor() != YELLOW)
It's also stateless and it just removes some elements (in our case, the yellow ones). Now let's have a terminal operation:
.forEach(System.out::println)
It just prints the input elements to the terminal. Fusion means that all intermediate stateless operations are merged with the terminal consumer into a single operation:
.map(x -> x.squash())
.filter(x -> x.getColor() != YELLOW)
.forEach(System.out::println)
The whole pipeline is fused into a single Consumer which is connected directly to the source. When each element is processed, the source spliterator just executes the combined consumer; the stream pipeline does not intercept anything and does not perform any additional bookkeeping. That's fusion. Fusion does not depend on short-circuiting. It's possible to implement streams without fusion (execute one operation, take the result, execute the next operation, handing control back to the stream engine after each operation). It's also possible to have fusion without short-circuiting.
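To make the distinction concrete, here is a rough conceptual sketch of the two designs described above. It is not the JDK's internal code: the class `FusionSketch`, the helpers `mapStage` and `filterStage`, and the use of strings with `trim()` as a stand-in for `squash()` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names exist in the JDK.
class FusionSketch {

    // Wraps a downstream consumer so each element is mapped before being passed on.
    static <T, R> Consumer<T> mapStage(Function<T, R> mapper, Consumer<R> downstream) {
        return t -> downstream.accept(mapper.apply(t));
    }

    // Wraps a downstream consumer so only matching elements are passed on.
    static <T> Consumer<T> filterStage(Predicate<T> predicate, Consumer<T> downstream) {
        return t -> { if (predicate.test(t)) downstream.accept(t); };
    }

    // Fused: map, filter and the terminal action are composed into ONE consumer,
    // and the source simply feeds every element to it.
    static List<String> fused(List<String> source) {
        List<String> out = new ArrayList<>();
        Consumer<String> terminal = out::add;
        Consumer<String> filterThenCollect = filterStage(s -> !s.equals("yellow"), terminal);
        Consumer<String> pipeline = mapStage(String::trim, filterThenCollect);
        source.forEach(pipeline);
        return out;
    }

    // Not fused, yet still one element at a time: the "engine" loop runs one stage,
    // takes the result back, then decides whether to run the next stage.
    static List<String> unfused(List<String> source) {
        List<Function<String, Optional<String>>> stages = new ArrayList<>();
        stages.add(s -> Optional.of(s.trim()));                                  // map
        stages.add(s -> s.equals("yellow") ? Optional.empty() : Optional.of(s)); // filter
        List<String> out = new ArrayList<>();
        for (String element : source) {
            Optional<String> current = Optional.of(element);
            for (Function<String, Optional<String>> stage : stages) {
                if (!current.isPresent()) break;      // element was filtered out
                current = stage.apply(current.get()); // control returns to the loop here
            }
            current.ifPresent(out::add);              // terminal action
        }
        return out;
    }
}
```

Both methods return the same list for the same input. The second one shows that giving up fusion need not mean falling back to eager, stage-by-stage evaluation; it only means the engine does extra bookkeeping between stages.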