Support for Compressed Strings being dropped in HotSpot JVM?

2022-01-16 00:00:00 performance jvm java java-7

The Oracle page Java HotSpot VM Options lists -XX:+UseCompressedStrings as available and on by default. However, in Java 6 update 29 it is off by default, and in Java 7 update 2 it produces this warning:

Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseCompressedStrings; support was removed in 7.0

Does anyone know the thinking behind removing this option?

Sorting lines of an enormous file.txt in Java

With -mx2g, this example took 4.541 seconds with the option on and 5.206 seconds with it off in Java 6 update 29. It is hard to see that the option hurts performance.

Note: Java 7 update 2 requires 2.0 GB, whereas Java 6 update 29 requires 1.8 GB without compressed strings and only 1.0 GB with them.

Recommended answer

Originally, this option was added to improve SPECjBB performance. The gains come from reduced memory bandwidth requirements between the processor and DRAM: loading and storing bytes in a byte[] consumes half the bandwidth of loading and storing chars in a char[].

However, this comes at a price. The code has to determine whether the internal array is a byte[] or a char[]. This takes CPU time, and if the workload is not constrained by memory bandwidth, it can cause a performance regression. The added complexity also carries a code maintenance cost.
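To make the cost concrete, here is a minimal sketch of a string-like class that stores either a byte[] or a char[]. It is an illustration only, not the actual HotSpot or java.lang.String implementation: the class name HybridString, its methods, and the rule "compress only when every char is 7-bit ASCII" are all assumptions made for the example.

```java
// Hypothetical illustration; not the actual HotSpot or java.lang.String code.
final class HybridString {
    // Holds either a byte[] (all chars are ASCII) or a char[] (general case).
    private final Object value;

    private HybridString(Object value) {
        this.value = value;
    }

    // Assumption: a string is compressible when every char is 7-bit ASCII.
    static HybridString of(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7F) {
                return new HybridString(s.toCharArray());
            }
        }
        byte[] bytes = new byte[s.length()];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) s.charAt(i);
        }
        return new HybridString(bytes);
    }

    boolean isCompressed() {
        return value instanceof byte[];
    }

    // Every access pays for this type check before touching the data.
    char charAt(int index) {
        if (value instanceof byte[]) {
            return (char) ((byte[]) value)[index];
        }
        return ((char[]) value)[index];
    }

    public static void main(String[] args) {
        HybridString ascii = HybridString.of("jvm");
        HybridString wide = HybridString.of("\u65e5\u672c");
        System.out.println(ascii.isCompressed() + " " + ascii.charAt(1));
        System.out.println(wide.isCompressed() + " " + (int) wide.charAt(0));
    }
}
```

The branch in charAt is the kind of per-access check described above: it is cheap, but it runs on every operation, so it only pays off when the workload is limited by memory bandwidth rather than by CPU.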

Because there weren't enough production-like workloads that showed significant gains (except perhaps SPECjBB), the option was removed.

There is another angle to this: the option reduces heap usage. For applicable Strings, it cuts their memory usage in half. This angle wasn't considered when the option was removed. For workloads that are constrained by memory capacity (i.e. they have to run with limited heap space and GC takes a lot of time), this option can prove useful.
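As a rough sketch of where the saving comes from, the following compares the character payload of a String stored as a char[] (2 bytes per char) with a compressed byte[] (1 byte per char). The class and method names are hypothetical, the ASCII-only compression rule is an assumption, and object headers and array alignment are deliberately ignored, so real heap figures will differ.

```java
// Hypothetical helper; counts character payload only, ignoring object headers.
final class StringPayload {
    // Assumption: only 7-bit ASCII strings are eligible for compression.
    static boolean isAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7F) {
                return false;
            }
        }
        return true;
    }

    // Payload when the characters are stored in a char[] (2 bytes each).
    static long charArrayBytes(String s) {
        return 2L * s.length();
    }

    // Payload when eligible strings are stored in a byte[] (1 byte each).
    static long compressedBytes(String s) {
        return isAscii(s) ? s.length() : 2L * s.length();
    }

    public static void main(String[] args) {
        String line = "GET /index.html HTTP/1.1";
        System.out.println("char[] payload:     " + charArrayBytes(line));
        System.out.println("compressed payload: " + compressedBytes(line));
    }
}
```

For text that is mostly ASCII, such as the lines of the file sorted in the question, nearly every String takes the one-byte path, which is consistent with the drop from 1.8 GB to 1.0 GB reported there.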

If enough production-like workloads that are constrained by memory capacity can be found to justify the option's inclusion, then maybe the option will be brought back.

Edit 3/20/2013: An average server heap dump spends 25% of its space on Strings. Most Strings are compressible. If the option were reintroduced, it could save up to half of that space (i.e. ~12% of the heap)!
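Spelled out, and assuming every String in the heap is compressible (an upper bound, since the figure above only says that most are), the estimate is:

```latex
\[
\underbrace{0.25}_{\text{heap share of Strings}} \times \underbrace{\tfrac{1}{2}}_{\text{saving per String}} = 0.125 \approx 12\%
\]
```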

Edit 3/10/2016: A feature similar to compressed strings is coming back in JDK 9 as JEP 254 (Compact Strings).
