Where in the official documentation does it say that Java's parallel stream operations use fork/join?

This is my understanding of the Stream framework:

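  1. Something creates a source stream.
  2. The implementation is responsible for providing a BaseStream#parallel() method, which in turn returns a Stream that can run its operations in parallel.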
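
The two steps above can be sketched as follows. This is only a minimal illustration, assuming a list-backed source; the class name ParallelFlagDemo and its helper method are made up for this example. It shows only that parallel() flips the stream's parallel flag and says nothing about which threads later run the work.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the JDK.
final class ParallelFlagDemo {

    // Returns { flag before parallel(), flag after parallel() }.
    static boolean[] flags() {
        // Step 1: something creates a source stream (here: a list).
        List<Integer> source = Arrays.asList(1, 2, 3, 4);
        Stream<Integer> stream = source.stream();
        boolean before = stream.isParallel();

        // Step 2: BaseStream#parallel() returns a stream marked as parallel.
        Stream<Integer> parallel = stream.parallel();
        boolean after = parallel.isParallel();

        return new boolean[] { before, after };
    }

    public static void main(String[] args) {
        boolean[] f = flags();
        System.out.println("before parallel(): " + f[0]);
        System.out.println("after parallel():  " + f[1]);
    }
}
```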


While someone has already found a way to use a custom thread pool with the Stream framework's parallel execution, I cannot for the life of me find any mention in the Java 8 API that the default Java 8 parallel Stream implementations use ForkJoinPool#commonPool() (Collection#parallelStream(), the methods in the StreamSupport class, and other possible sources of parallel-enabled streams in the API that I don't know about).

The only tidbits I could glean from search results were these:

  • State of the Lambda: Libraries Edition ("Parallelism under the hood")
    Vaguely mentions the Stream framework and the Fork/Join machinery.

      "The fork/join machinery is designed to automate this process."

  • JEP 107: Bulk Data Operations for Collections
    Almost directly states that the Collection interface's default method #parallelStream() is implemented using Fork/Join. But still nothing about the common pool.

      "The parallel implementation builds upon the java.util.concurrency Fork/Join implementation introduced in Java 7."

    Hence: Collection#parallelStream().

  • Class Arrays (Javadoc)
    Directly states multiple times that the common pool is used.

      "The ForkJoin common pool is used to execute any parallel tasks."
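
For context, the Arrays methods that carry that Javadoc sentence include Arrays.parallelSort. A minimal sketch of calling it, next to the parallelism level the common pool reports, might look like this. The class name ParallelSortDemo and its helper are made up for this example; note also that the Javadoc allows small arrays to be sorted sequentially, so this call may never actually touch the pool.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;

// Hypothetical demo class, not part of the JDK.
final class ParallelSortDemo {

    // Sorts a copy of the input with Arrays.parallelSort, whose Javadoc
    // says the ForkJoin common pool executes any parallel tasks.
    static int[] sortedCopy(int[] input) {
        int[] copy = Arrays.copyOf(input, input.length);
        Arrays.parallelSort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println("common pool parallelism: "
                + ForkJoinPool.getCommonPoolParallelism());
        System.out.println(Arrays.toString(sortedCopy(new int[] { 3, 1, 2 })));
    }
}
```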

So my question is:

Where is it said that ForkJoinPool#commonPool() is used for parallel operations on streams that are obtained from the Java 8 API?

Recommended answer

W.r.t. where it is documented that Java 8 parallel streams use the FJ framework

AFAIK (Java 1.8u5), the JavaDoc of parallel streams does not mention that the common ForkJoinPool is used.

But it is mentioned at the bottom of the ForkJoin documentation at http://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html
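
Where the documentation is silent, the behavior can at least be observed. The sketch below records the names of the threads that execute a parallel stream's forEach. The class name CommonPoolProbe is made up for this example, and the "ForkJoinPool.commonPool-worker-" thread-name prefix is an assumption about current JDK implementations, not a documented guarantee.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the JDK.
final class CommonPoolProbe {

    // Collects the names of all threads that ran part of the parallel forEach.
    static Set<String> workerThreadNames() {
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream.range(0, 10_000)
                 .parallel()
                 .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        // Typically prints the calling thread plus common-pool workers;
        // the exact set varies from run to run and machine to machine.
        workerThreadNames().forEach(System.out::println);
    }
}
```

The calling thread usually shows up in the set as well, because it takes part in executing the stream's tasks.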

W.r.t. replacing the thread pool

My understanding is that you can use a custom ForkJoinPool (instead of the common one); see "Custom thread pool in Java 8 parallel stream". You cannot, however, use a custom thread pool that is not a ForkJoin implementation (I have an open question on this: "How to (globally) replace the common thread pool backend of Java parallel streams?").
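
A minimal sketch of that custom-pool approach: start the terminal operation from a task submitted to your own ForkJoinPool. The class name CustomPoolDemo and its helper are made up for this example. That the stream's subtasks then stay in the submitting pool is observed implementation behavior, which, as far as I know, the stream JavaDoc does not promise.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the JDK.
final class CustomPoolDemo {

    // Runs a parallel stream from inside a custom ForkJoinPool and
    // returns the names of the threads that did the work.
    static Set<String> threadNamesInCustomPool(int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            Set<String> names = ConcurrentHashMap.newKeySet();
            // The lambda returns nothing, so this resolves to submit(Runnable);
            // join() waits for completion without checked exceptions.
            pool.submit(() ->
                IntStream.range(0, 10_000)
                         .parallel()
                         .forEach(i -> names.add(Thread.currentThread().getName()))
            ).join();
            return names;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Expect names of the custom pool's workers rather than common-pool ones.
        threadNamesInCustomPool(2).forEach(System.out::println);
    }
}
```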

W.r.t. replacing the Streams API

You may check out https://github.com/nurkiewicz/LazySeq, which is a more Scala-like stream implementation. Very nice, very interesting.

PS (w.r.t. ForkJoin and Streams)

If you are interested, I would like to note that I stumbled across some issues with the use of the FJ pool; see, e.g.:

  • Nested Java 8 parallel forEach loops perform poorly. Is this behavior expected?
  • Using a semaphore inside a nested Java 8 parallel stream action may deadlock. Is this a bug?
