Java 8 的迭代器与流

为了利用 Jdk 8 的 java.util.stream 中包含的各种查询方法,我试图设计与 * 有关系的领域模型多重性(具有零个或多个实例)返回 Stream,而不是 IterableIterator.

To take advantage of the wide range of query methods included in java.util.stream of Jdk 8 I am attempted to design domain models where getters of relationship with * multiplicity (with zero or more instances ) return a Stream<T>, instead of an Iterable<T> or Iterator<T>.

我怀疑 Stream<T>Iterator<T> 相比是否会产生任何额外的开销?

My doubt is if there is any additional overhead incurred by the Stream<T> in comparison to the Iterator<T>?

那么,使用 Stream<T> 破坏我的域模型有什么缺点吗?

So, is there any disadvantage of compromising my domain model with a Stream<T>?

或者,我应该始终返回 Iterator 还是 Iterable,并让最终用户决定是否使用通过使用 StreamUtils?

Or instead, should I always return an Iterator<T> or Iterable<T>, and leave to the end-user the decision of choosing whether to use a stream, or not, by converting that iterator with the StreamUtils?

注意返回 Collection 不是一个有效的选项,因为在这种情况下,大多数关系都是惰性的并且大小未知.

Note that returning a Collection is not a valid option because in this case most of the relationships are lazy and with unknown size.

推荐答案

这里有很多性能建议,但遗憾的是,其中大部分是猜测,很少有真正的性能考虑.

There's lots of performance advice here, but sadly much of it is guesswork, and little of it points to the real performance considerations.

@Holger 正确指出我们应该抵制看似压倒性的趋势,让性能尾巴摇摆不定API设计狗.

@Holger gets it right by pointing out that we should resist the seemingly overwhelming tendency to let the performance tail wag the API design dog.

虽然在任何给定情况下,有无数的考虑因素可以使流比其他形式的遍历更慢、相同或更快,但有一些因素表明流在重要的地方具有性能优势 -- 在大数据集上.

While there are a zillion considerations that can make a stream slower than, the same as, or faster than some other form of traversal in any given case, there are some factors that point to streams haven a performance advantage where it counts -- on big data sets.

与创建 Iterator 相比,创建 Stream 有一些额外的固定启动开销——在开始计算之前需要多一些对象.如果你的数据集很大,没关系;这是一个通过大量计算摊销的小型启动成本.(如果您的数据集很小,这也可能无关紧要 - 因为如果您的程序在小型数据集上运行,那么性能通常也不是您的第一大问题.)在哪里问题是什么时候平行;建立管道所花费的任何时间都属于阿姆达尔定律的序列部分;如果您查看实现,我们会努力在流设置期间保持对象计数,但我很乐意找到减少它的方法,因为这会直接影响并行开始胜出的盈亏平衡数据集大小顺序的.

There is some additional fixed startup overhead of creating a Stream compared to creating an Iterator -- a few more objects before you start calculating. If your data set is large, it doesn't matter; it's a small startup cost amortized over a lot of computation. (And if your data set is small, it probably also doesn't matter -- because if your program is operating on small data sets, performance is generally not your #1 concern either.) Where this does matter is when going parallel; any time spent setting up the pipeline goes into the serial fraction of Amdahl's law; if you look at the implementation, we work hard to keep the object count down during stream setup, but I'd be happy to find ways to reduce it as that has a direct effect on the breakeven data set size where parallel starts to win over sequential.

但是,比固定启动成本更重要的是每个元素的访问成本.在这里,流媒体实际上赢了——而且经常赢大——有些人可能会感到惊讶.(在我们的性能测试中,我们经常看到流管道在 Collection 对应物上的性能优于其 for-loop.)而且,对此有一个简单的解释:Spliterator 从根本上降低了每个元素的访问成本高于 Iterator,甚至是顺序访问.有几个原因.

But, more important than the fixed startup cost is the per-element access cost. Here, streams actually win -- and often win big -- which some may find surprising. (In our performance tests, we routinely see stream pipelines which can outperform their for-loop over Collection counterparts.) And, there's a simple explanation for this: Spliterator has fundamentally lower per-element access costs than Iterator, even sequentially. There are several reasons for this.

  1. 迭代器协议从根本上来说效率较低.它需要调用两个方法来获取每个元素.此外,因为迭代器必须对诸如在没有 hasNext() 的情况下调用 next() 或在没有 的情况下多次调用 hasNext() 等事情具有鲁棒性next(),这两种方法通常都必须进行一些防御性编码(通常还有更多的状态和分支),这会增加效率.另一方面,即使是遍历拆分器 (tryAdvance) 的慢速方法也没有这种负担.(对于并发数据结构来说更糟,因为 next/hasNext 对偶性从根本上说是活泼的,而 Iterator 实现必须做更多的工作来保护与 Spliterator 实现相比,防止并发修改.)

  1. The Iterator protocol is fundamentally less efficient. It requires calling two methods to get each element. Further, because Iterators must be robust to things like calling next() without hasNext(), or hasNext() multiple times without next(), both of these methods generally have to do some defensive coding (and generally more statefulness and branching), which adds to inefficiency. On the other hand, even the slow way to traverse a spliterator (tryAdvance) doesn't have this burden. (It's even worse for concurrent data structures, because the next/hasNext duality is fundamentally racy, and Iterator implementations have to do more work to defend against concurrent modifications than do Spliterator implementations.)

Spliterator 进一步提供了快速路径"迭代——forEachRemaining——大部分时间都可以使用(reduction,forEach),进一步减少调解访问数据结构内部的迭代代码的开销.这也倾向于很好地内联,从而提高其他优化的有效性,例如代码移动、边界检查消除等.

Spliterator further offers a "fast-path" iteration -- forEachRemaining -- which can be used most of the time (reduction, forEach), further reducing the overhead of the iteration code that mediates access to the data structure internals. This also tends to inline very well, which in turn increases the effectiveness of other optimizations such as code motion, bounds check elimination, etc.

此外,通过 Spliterator 的遍历往往比使用 Iterator 的堆写入要少得多.使用 Iterator,每个元素都会导致一次或多次堆写入(除非 Iterator 可以通过转义分析进行标量化,并将其字段提升到寄存器中.)除其他问题外,这会导致 GC卡标记活动,导致卡标记的缓存行争用.另一方面,Spliterators 往往具有较少的状态,工业强度的 forEachRemaining 实现倾向于推迟将任何内容写入堆直到遍历结束,而不是存储它的本地迭代状态自然映射到寄存器,从而减少内存总线活动.

Further, traversal via Spliterator tend to have many fewer heap writes than with Iterator. With Iterator, every element causes one or more heap writes (unless the Iterator can be scalarized via escape analysis and its fields hoisted into registers.) Among other issues, this causes GC card mark activity, leading to cache line contention for the card marks. On the other hand, Spliterators tend to have less state, and industrial-strength forEachRemaining implementations tend to defer writing anything to the heap until the end of the traversal, instead storing its iteration state in locals which naturally map to registers, resulting in reduced memory bus activity.

总结:别担心,开心就好.Spliterator 是一个更好的 Iterator,即使没有并行性.(它们通常也更容易编写,更难出错.)

Summary: don't worry, be happy. Spliterator is a better Iterator, even without parallelism. (They're also generally just easier to write and harder to get wrong.)

相关文章