In which cases should stream operations be stateful?

2022-01-22 00:00:00 java-8 java java-stream

In the javadoc for the stream package, at the end of the Parallelism section, I read:

Most stream operations accept parameters that describe user-specified behavior, which are often lambda expressions. To preserve correct behavior, these behavioral parameters must be non-interfering, and in most cases must be stateless.

I have a hard time understanding this "in most cases". In which cases is it acceptable/desirable to have a stateful stream operation?

I mean, I know it is possible, especially when using sequential streams, but the same javadoc clearly states:

Except for operations identified as explicitly nondeterministic, such as findAny(), whether a stream executes sequentially or in parallel should not change the result of the computation.

And also:

Note also that attempting to access mutable state from behavioral parameters presents you with a bad choice with respect to safety and performance; [...] The best approach is to avoid stateful behavioral parameters to stream operations entirely; there is usually a way to restructure the stream pipeline to avoid statefulness.
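As a minimal sketch of the kind of restructuring that passage describes (the class and method names are invented for illustration, not taken from the javadoc): the first method accumulates results through a side effect on a shared list, while the second gets the same elements with a stateless mapper and a collector.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class RestructureExample {

    // Stateful: the lambda passed to forEach writes to a list that lives
    // outside the pipeline. The synchronized wrapper keeps it thread-safe,
    // but in parallel the insertion order is no longer predictable.
    static List<String> upperCaseWithSideEffect(List<String> words) {
        List<String> results = Collections.synchronizedList(new ArrayList<>());
        words.parallelStream()
             .map(String::toUpperCase)
             .forEach(results::add);
        return results;
    }

    // Restructured: the mapper is stateless and the pipeline owns the
    // accumulation, so encounter order is kept even in parallel.
    static List<String> upperCaseWithCollect(List<String> words) {
        return words.parallelStream()
                    .map(String::toUpperCase)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "c", "d");
        System.out.println(upperCaseWithSideEffect(words)); // same elements, order may vary
        System.out.println(upperCaseWithCollect(words));    // [A, B, C, D]
    }
}
```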

So, my question is: in which circumstances is it a good practice to use a stateful stream operation (and not for methods working by side-effect, such as forEach)?

A related question could be: why are there operations working by side effect, such as forEach? I always end up doing a good old for loop to avoid having side-effects in my lambda expression.

Recommended answer

Examples of stateful stream lambdas:

  • collect(Collector): The Collector is by definition stateful, since it has to collect all the elements in a collection (state).
  • forEach(Consumer): The Consumer is by definition stateful, well except if it's a black hole (no-op).
  • peek(Consumer): The Consumer is by definition stateful, because why peek if not to store it somewhere (e.g. log).
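To make the three bullets concrete, here is a small sketch (the class name, the log list and the sample data are invented for illustration). Each Consumer or Collector in it is only useful because it writes to some state: a log, a sink list, or the result container.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StatefulByDesign {

    // peek(Consumer): records every element it sees in the caller's log.
    // collect(Collector): accumulates the surviving elements into a new List.
    static List<Integer> evens(List<Integer> input, List<String> log) {
        return input.stream()
                    .peek(n -> log.add("saw " + n))
                    .filter(n -> n % 2 == 0)
                    .collect(Collectors.toList());
    }

    // forEach(Consumer): the only observable effect is the write to 'sink'.
    // This is safe here only because the stream is sequential and the
    // list is not shared with other threads.
    static List<Integer> copyWithForEach(List<Integer> input) {
        List<Integer> sink = new ArrayList<>();
        input.stream().forEach(sink::add);
        return sink;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Integer> input = Arrays.asList(1, 2, 3, 4);
        System.out.println(evens(input, log));          // [2, 4]
        System.out.println(log.size());                 // 4
        System.out.println(copyWithForEach(input).size()); // 4
    }
}
```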

So, Collector and Consumer are two lambda interfaces that by definition are stateful.

All the others, e.g. Predicate, Function, UnaryOperator, BinaryOperator, and Comparator, should be stateless.
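A hedged illustration of why that matters (names and data are made up for this sketch): with a stateless Predicate, Function and Comparator the result is the same sequentially and in parallel, whereas a Function that depends on a shared counter gives results that depend on the order in which elements happen to be processed.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StatelessLambdas {

    // Stateless: each lambda's result depends only on its argument.
    static List<Integer> squaresOfEvens(List<Integer> input, boolean parallel) {
        Stream<Integer> s = parallel ? input.parallelStream() : input.stream();
        return s.filter(n -> n % 2 == 0)            // Predicate
                .map(n -> n * n)                    // Function
                .sorted(Comparator.naturalOrder())  // Comparator
                .collect(Collectors.toList());
    }

    // Stateful (discouraged): the Function's result depends on how many
    // elements were mapped before this one, so a parallel run may pair
    // elements with different counter values on each execution.
    static List<Integer> valueTimesCallCount(List<Integer> input, boolean parallel) {
        AtomicInteger calls = new AtomicInteger();
        Stream<Integer> s = parallel ? input.parallelStream() : input.stream();
        return s.map(n -> n * calls.getAndIncrement())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(squaresOfEvens(input, false)); // [4, 16, 36]
        System.out.println(squaresOfEvens(input, true));  // [4, 16, 36]
        System.out.println(valueTimesCallCount(input, true)); // may differ between runs
    }
}
```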
