Counting the elements of a Stream

2022-01-22 00:00:00 count java-8 java java-stream collectors

I want to count the occurrences of each distinct element of a stream, and I am wondering why

Stream<String> stream = Stream.of("a", "b", "a", "c", "c", "a", "a", "d");
Map<String, Integer> counter1 = stream.collect(Collectors.toMap(s -> s, 1, Integer::sum));

doesn't work. Eclipse tells me:

The method toMap(Function, Function, BinaryOperator) in the type Collectors is not applicable for the arguments (( s) -> {}, int, Integer::sum)

By the way, I know about this solution:

Map<String, Long> counter2 = stream.collect(Collectors.groupingBy(s -> s, Collectors.counting()));

So I have two questions:

  1. What is the mistake in my first approach?
  2. How would you implement such a counter?

I solved the first question myself:

Map<String, Integer> counter1 = stream.collect(Collectors.toMap(s -> s, s -> 1, Integer::sum)); 

Java expects a function as the second argument: the valueMapper of toMap must be a Function, so the literal 1 has to be written as s -> 1.
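For reference, the corrected call can be wrapped in a small runnable program. This is only a sketch: the class name ToMapCounter and the helper method count are made up for illustration and do not come from the question.

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical wrapper class, named for this example only.
class ToMapCounter {

    // Counts occurrences by mapping every element to 1 and merging duplicates.
    static Map<String, Integer> count(Stream<String> stream) {
        return stream.collect(Collectors.toMap(
                s -> s,          // keyMapper: the element itself is the key
                s -> 1,          // valueMapper: a Function, not the literal 1
                Integer::sum));  // mergeFunction: adds the counts when a key repeats
    }

    public static void main(String[] args) {
        // A stream can be consumed only once, so a fresh one is created here.
        Map<String, Integer> counter =
                count(Stream.of("a", "b", "a", "c", "c", "a", "a", "d"));
        System.out.println(counter);
    }
}
```

With the sample data, "a" maps to 4, "b" to 1, "c" to 2 and "d" to 1.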

Recommended answer

There are indeed several ways to do it. The one you haven't mentioned is .collect(groupingBy(x -> x, summingInt(x -> 1)));
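Spelled out as a complete program, the summingInt variant might look like the following sketch. The class name SummingIntCounter and the method count are hypothetical names chosen for this example; the collectors themselves are the ones named above, imported statically.

```java
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.summingInt;

import java.util.Map;
import java.util.stream.Stream;

// Hypothetical wrapper class, named for this example only.
class SummingIntCounter {

    // Groups equal elements together and adds 1 per element in each group.
    static Map<String, Integer> count(Stream<String> stream) {
        return stream.collect(groupingBy(x -> x, summingInt(x -> 1)));
    }

    public static void main(String[] args) {
        Map<String, Integer> counter =
                count(Stream.of("a", "b", "a", "c", "c", "a", "a", "d"));
        System.out.println(counter);
    }
}
```

Note that the result type is Map<String, Integer>, whereas the counting() variant yields Map<String, Long>.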

They differ somewhat in performance.

Approach #1 (toMap) is at its best when there are very few objects per bucket. In the ideal case of only one object per bucket, you end up with the final map right away, with no need to modify the entries. In the worst case, with a very large number of repeated objects, it has to do a lot of boxing and unboxing.

Approach #2 (groupingBy with counting()) relies on the counting() collector, which doesn't specify exactly how it does the counting. The Java 8 implementation forwards to reducing, but that is an implementation detail and might change.
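To make "forwards to reducing" concrete, the sketch below places counting() next to a reducing-based collector that yields the same counts. The arguments passed to reducing here are an illustration of the idea, not a quote of the JDK source, and the class and method names are hypothetical.

```java
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.reducing;

import java.util.Map;
import java.util.stream.Stream;

// Hypothetical class, named for this example only.
class CountingVersusReducing {

    // The variant from the question: the counting strategy is left to the library.
    static Map<String, Long> withCounting(Stream<String> stream) {
        return stream.collect(groupingBy(x -> x, counting()));
    }

    // An explicit reduction: start at 0L, map each element to 1L, add them up.
    // Every addition works on boxed Long values.
    static Map<String, Long> withReducing(Stream<String> stream) {
        return stream.collect(groupingBy(x -> x, reducing(0L, e -> 1L, Long::sum)));
    }

    public static void main(String[] args) {
        System.out.println(withCounting(Stream.of("a", "b", "a", "c")));
        System.out.println(withReducing(Stream.of("a", "b", "a", "c")));
    }
}
```

Both methods produce equal maps; only the way the count is accumulated differs.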

The summingInt approach accumulates the count in an int rather than an Integer, so it requires no boxing or unboxing while counting. It is at its best when objects repeat a very large number of times.
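The idea of accumulating in a primitive can be illustrated with a hand-written collector built with Collector.of. This is a sketch in the spirit of summingInt, not its actual implementation; the names IntArrayCounter, countingInt and count are hypothetical.

```java
import static java.util.stream.Collectors.groupingBy;

import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Stream;

// Hypothetical class, named for this example only.
class IntArrayCounter {

    // A collector that counts elements in a one-element int array,
    // so the running total stays a primitive until the very end.
    static <T> Collector<T, int[], Integer> countingInt() {
        return Collector.of(
                () -> new int[1],                   // supplier: counter starts at 0
                (acc, t) -> acc[0]++,               // accumulator: primitive increment
                (left, right) -> {                  // combiner: merges partial counts
                    left[0] += right[0];
                    return left;
                },
                acc -> acc[0]);                     // finisher: boxes once per group
    }

    static Map<String, Integer> count(Stream<String> stream) {
        return stream.collect(groupingBy(x -> x, countingInt()));
    }

    public static void main(String[] args) {
        System.out.println(count(Stream.of("a", "b", "a", "c", "c", "a", "a", "d")));
    }
}
```

Here the only boxing happens in the finisher, once per distinct key, regardless of how often each element repeats.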

As for which one to choose, it is best to code for clarity and optimize only when it becomes necessary. To me, groupingBy(x -> x, counting()) expresses the intent most clearly, so that's the one I would favor.
