GC Overhead of Optional<T> in Java
We all know that every object allocated in Java adds a weight into future garbage collection cycles, and Optional<T> objects are no different. We use these objects frequently to wrap nullable values, which leads to safer code, but at what cost? Does anyone have information on what kind of additional GC pressure Optional objects add vs. simply returning nulls, and what kind of impact this has on performance in high-throughput systems?
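To make the comparison concrete, here is a minimal sketch of the two styles in question (the directory class and its method names are made up for illustration):

import java.util.Map;
import java.util.Optional;

class UserDirectory {
    private final Map<String, String> emailsById = Map.of("42", "alice@example.com");

    // Null-returning style: no extra allocation, but the caller has to
    // remember that the result may be null.
    String findEmailOrNull(String id) {
        return emailsById.get(id);
    }

    // Optional-returning style: one small wrapper object per call
    // (unless the JIT elides it), but the possible absence is explicit.
    Optional<String> findEmail(String id) {
        return Optional.ofNullable(emailsById.get(id));
    }
}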
Solution

"We all know that every object allocated in Java adds a weight into future garbage collection cycles, …"
That sounds like a statement nobody could deny, but let's look at the actual work of a garbage collector, considering common implementations of modern JVMs and the impact an allocated object has on it, especially objects like Optional instances, which are typically of a temporary nature.
The first task of the garbage collector is to identify the objects that are still alive. The name "garbage collector" puts the focus on identifying garbage, but garbage is defined as unreachable objects, and the only way to find out which objects are unreachable is by process of elimination. So the first task is solved by traversing and marking all reachable objects. The cost of this process therefore does not depend on the total number of allocated objects, but only on those that are still reachable.
The second task is to make the memory of the garbage available to new allocations. Instead of puzzling over the memory gaps between still-reachable objects, all modern garbage collectors work by evacuating a complete region, transferring all live objects within that memory to a new location and adapting the references to them. Afterwards, the memory is available to new allocations as one whole block. So this is again a process whose cost does not depend on the total number of allocated objects, but only on (a part of) the objects that are still alive.
Therefore, an object like a temporary Optional may impose no costs on the actual garbage collection process at all if it is allocated and abandoned between two garbage collection cycles.
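As a concrete illustration (the config-lookup class below is hypothetical), such a temporary Optional exists only for the duration of one expression and is unreachable immediately afterwards, so unless a collection happens to run in exactly that instant, the collector never traverses or evacuates it:

import java.util.Map;
import java.util.Optional;

class Config {
    private final Map<String, String> values = Map.of("timeout", "30");

    int timeoutSeconds() {
        // Both Optionals created here (by ofNullable and map) are dropped
        // as soon as orElse() returns; if no GC runs in between, they cost
        // the collector itself nothing.
        return Optional.ofNullable(values.get("timeout"))
                       .map(Integer::parseInt)
                       .orElse(60);
    }
}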
With one catch, of course. Each allocation reduces the memory available to subsequent allocations, until there is no space left and a garbage collection has to take place. So we could say that each allocation shortens the time between two garbage collection runs by a fraction equal to the object's size divided by the size of the allocation space. Not only is this a rather tiny fraction, it also only applies to a single-threaded scenario.
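To put a rough number on that fraction, a small back-of-the-envelope calculation (the 16-byte Optional size and the 256 MiB Eden are assumptions for illustration, not measurements):

public class AllocationFraction {
    public static void main(String[] args) {
        // A plain Optional on 64-bit HotSpot with compressed oops is
        // typically 16 bytes (12-byte header + one reference, padded).
        long optionalSize = 16;
        // Assumed allocation (Eden) space of 256 MiB; real sizes vary widely.
        long edenSize = 256L * 1024 * 1024;
        // Each such allocation brings the next young collection closer
        // by roughly this fraction of the interval:
        System.out.printf("one part in %d (~%.1e)%n",
                edenSize / optionalSize, (double) optionalSize / edenSize);
        // prints: one part in 16777216 (~6.0e-08)
    }
}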
In implementations like the HotSpot JVM, each thread uses a thread-local allocation buffer (TLAB) for new objects. Once its TLAB is full, it will fetch a new one from the allocation space (also known as the Eden space). If none is available, a garbage collection will be triggered. Now, it is rather unlikely that all threads hit the end of their TLAB at exactly the same time. So for the other threads, which still have some space left in their TLAB at that moment, it would not make any difference if they had allocated some more objects that still fit into that remaining space.
The perhaps surprising conclusion is that not every allocated object has an impact on garbage collection: a purely local object, allocated by a thread that does not trigger the next GC, can be entirely free.
Of course, this does not apply to allocating a large number of objects. Allocating lots of them causes the thread to request more TLABs and eventually trigger the garbage collection earlier than it would have otherwise. That's why we have classes like IntStream, which allow processing a large number of elements without allocating objects, as would happen with a Stream<Integer>, while there is no problem in providing the result as a single OptionalInt instance. As we now know, a single temporary object has only a tiny impact on the GC, if any.
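A short sketch of that contrast, using only the standard java.util.stream API (the numbers are arbitrary):

import java.util.Optional;
import java.util.OptionalInt;
import java.util.stream.IntStream;

public class StreamAllocation {
    public static void main(String[] args) {
        // Primitive stream: processes a million ints without boxing;
        // the only result object is a single OptionalInt.
        OptionalInt max = IntStream.rangeClosed(1, 1_000_000).max();

        // Boxed equivalent: one Integer per element (modulo the small
        // Integer cache and JIT optimizations), plus an Optional<Integer>.
        Optional<Integer> boxedMax = IntStream.rangeClosed(1, 1_000_000)
                                              .boxed()
                                              .max(Integer::compare);

        System.out.println(max.getAsInt() + " " + boxedMax.orElseThrow());
    }
}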
This did not even touch the JVM's optimizer, which may eliminate object allocations in hot code paths if escape analysis proves that the object is purely local.
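A hedged sketch of the kind of code where that can apply; whether the allocation is actually eliminated depends on inlining decisions and the particular JIT, so treat this as a possibility rather than a guarantee:

import java.util.Optional;

class Labels {
    // The Optional never leaves this method. Once ofNullable() and
    // orElse() are inlined, escape analysis can see that the wrapper is
    // purely local and may scalar-replace it, removing the allocation.
    static String labelOrDefault(String label) {
        return Optional.ofNullable(label).orElse("unknown");
    }
}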