Grouping a Java 8 stream without collecting it
Is there any way in Java 8 to group the elements in a java.util.stream.Stream without collecting them? I want the result to be a Stream again. Because I have to work with a lot of data, or even infinite streams, I cannot collect the data first and then stream the result again.
All elements that need to be grouped are consecutive in the first stream, so I would like to keep the stream evaluation lazy.
Recommended answer
There is no way to do this with the standard Stream API. In the general case it cannot be done at all: a new item belonging to an already created group may always appear later in the input, so no group can be passed to downstream processing until all the input has been processed.
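To make the problem concrete, here is a minimal sketch that uses only the standard JDK. The class and method names (GroupingNeedsAllInput, groupByFirstLetter) are made up for illustration. The group for the letter "a" receives its second member only at the very end of the input, which is why a general-purpose grouping has to be a terminal collect operation:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class GroupingNeedsAllInput {

    // Hypothetical helper: groups words by first letter with the standard collector.
    // collect() is a terminal operation, so the whole stream is consumed here.
    static Map<Character, List<String>> groupByFirstLetter(Stream<String> words) {
        return words.collect(Collectors.groupingBy(
                w -> w.charAt(0),
                LinkedHashMap::new,      // keep groups in order of first appearance
                Collectors.toList()));
    }

    public static void main(String[] args) {
        // "avocado" arrives after "banana", so the 'a' group cannot be
        // declared complete until the input has ended.
        System.out.println(groupByFirstLetter(Stream.of("apple", "banana", "avocado")));
        // prints {a=[apple, avocado], b=[banana]}
    }
}
```

On an infinite stream this call would never return, which matches the limitation described above.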
However, if you know in advance that the items to be grouped are always adjacent in the input stream, you can solve the problem with a third-party library that enhances the Stream API. One such library is StreamEx, which is free and written by me. It contains a number of "partial reduction" operators that collapse adjacent items into a single item based on some predicate. Usually you supply a BiPredicate that tests two adjacent items and returns true if they should be grouped together. Some of the partial reduction operations are listed below:
collapse(BiPredicate): replaces each group with the first element of the group. For example, collapse(Objects::equals) is useful for removing adjacent duplicates from the stream.
groupRuns(BiPredicate): replaces each group with a List of the group's elements (so StreamEx<T> is converted to StreamEx<List<T>>). For example, stringStream.groupRuns((a, b) -> a.charAt(0) == b.charAt(0)) creates a stream of lists of strings in which each list contains adjacent strings starting with the same letter.
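For readers who want to see the idea behind grouping adjacent items without pulling in a library, the following is a rough sketch built on a custom Spliterator from the standard JDK. It is not how StreamEx implements groupRuns. The names AdjacentGrouping and groupAdjacent are hypothetical, and unlike the StreamEx operators this version is sequential only. It buffers a single group at a time, so it stays lazy and also works on infinite streams:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.BiPredicate;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class AdjacentGrouping {

    // Hypothetical helper (not part of the JDK or StreamEx): lazily turns a
    // Stream<T> into a Stream<List<T>> of runs of adjacent elements.
    static <T> Stream<List<T>> groupAdjacent(Stream<T> source,
                                             BiPredicate<? super T, ? super T> sameGroup) {
        Spliterator<T> in = source.spliterator();
        Spliterator<List<T>> out = new Spliterators.AbstractSpliterator<List<T>>(
                Long.MAX_VALUE, Spliterator.ORDERED) {
            private List<T> current; // group being built, or null if none yet
            private List<T> ready;   // finished group waiting to be emitted

            @Override
            public boolean tryAdvance(Consumer<? super List<T>> action) {
                ready = null;
                // Pull source elements one by one until a group is finished.
                while (ready == null && in.tryAdvance(this::push)) { }
                if (ready == null && current != null) { // source exhausted: flush last group
                    ready = current;
                    current = null;
                }
                if (ready == null) {
                    return false;
                }
                action.accept(ready);
                return true;
            }

            private void push(T item) {
                if (current == null) {
                    current = new ArrayList<>();
                } else if (!sameGroup.test(current.get(current.size() - 1), item)) {
                    ready = current;             // the previous run has ended
                    current = new ArrayList<>(); // start a new run with this item
                }
                current.add(item);
            }
        };
        return StreamSupport.stream(out, false).onClose(source::close);
    }

    public static void main(String[] args) {
        // Infinite source: only as many elements as needed are consumed.
        List<List<Integer>> firstThree =
                groupAdjacent(Stream.iterate(0, i -> i + 1), (a, b) -> a / 3 == b / 3)
                        .limit(3)
                        .collect(Collectors.toList());
        System.out.println(firstThree); // prints [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    }
}
```

The predicate is applied to the last element of the current run and the incoming element, mirroring the "two adjacent items" contract described above.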
Other partial reduction operations include intervalMap, runLengths() and so on.
All partial reduction operations are lazy, parallel-friendly and quite efficient.
Note that you can easily construct a StreamEx object from a regular Java 8 stream using StreamEx.of(stream). There are also methods to construct it from an array, a Collection, a Reader, etc. The StreamEx class implements the Stream interface and is 100% compatible with the standard Stream API.