Reducing JVM pause time > 1 second using UseConcMarkSweepGC

I'm running a memory-intensive app on a machine with 16 GB of RAM and an 8-core processor, using Java 1.6 on CentOS release 5.2 (Final). The exact JVM details are:

java version "1.6.0_10"
Java(TM) SE Runtime Environment (build 1.6.0_10-b33)
Java HotSpot(TM) 64-Bit Server VM (build 11.0-b15, mixed mode)

I'm launching the app with the following command-line options:

java -XX:+UseConcMarkSweepGC -verbose:gc -server -Xmx10g -Xms10g ...

My application exposes a JSON-RPC API, and my goal is to respond to requests within 25 ms. Unfortunately, I'm seeing delays of up to and exceeding 1 second, and they appear to be caused by garbage collection. Here are some of the longer examples:

[GC 4592788K->4462162K(10468736K), 1.3606660 secs]
[GC 5881547K->5768559K(10468736K), 1.2559860 secs]
[GC 6045823K->5914115K(10468736K), 1.3250050 secs]

Each of these garbage collection events was accompanied by a delayed API response whose duration matched the length of the garbage collection shown to within a few milliseconds.
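To pull the long pauses out of a larger log, something like the following could work. It is only a sketch: it assumes the `-verbose:gc` line format shown above (pause time as the second-to-last field), and `long_pauses` is a hypothetical helper name, not part of any JVM tooling.

```shell
# Print the pause time (in seconds) of every GC log line on stdin whose
# pause exceeds the threshold given as $1.
# Assumes lines shaped like: [GC 4592788K->4462162K(10468736K), 1.3606660 secs]
long_pauses() {
  awk -v limit="$1" '
    /^\[(Full )?GC / {
      # The pause time is the field just before the trailing "secs]".
      if ($(NF-1) + 0 > limit + 0) print $(NF-1)
    }'
}
```

For example, `long_pauses 1.0 < gc.log` would list only the pauses longer than one second, where `gc.log` stands for wherever the GC output was captured.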

Here are some more typical examples (these were all produced within a few seconds of each other):

[GC 3373764K->3336654K(10468736K), 0.6677560 secs]
[GC 3472974K->3427592K(10468736K), 0.5059650 secs]
[GC 3563912K->3517273K(10468736K), 0.6844440 secs]
[GC 3622292K->3589011K(10468736K), 0.4528480 secs]

The thing is, I thought UseConcMarkSweepGC would avoid this, or at least make it extremely rare. On the contrary, delays exceeding 100 ms occur about once a minute or more often (although delays of over 1 second are considerably rarer, perhaps once every 10 or 15 minutes).

The other thing is that I thought only a full GC would cause threads to be paused, yet these don't appear to be full GCs.

It may be relevant to note that most of the memory is occupied by an LRU memory cache that makes use of soft references.

Any assistance or advice would be greatly appreciated.

Recommended answer

It turns out that part of the heap was getting swapped out to disk, so garbage collection had to pull a bunch of data off the disk and back into memory.
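One rough way to check whether a machine is in this situation is to look at how much swap is in use. The sketch below reads the standard `SwapTotal` and `SwapFree` fields from `/proc/meminfo`; `swap_used_kb` is a hypothetical helper name, and the optional file argument exists only so the function can be tried against a saved copy.

```shell
# Print the amount of swap currently in use, in kB.
# Reads /proc/meminfo by default (Linux-specific); pass another file as $1
# to inspect a saved copy instead.
swap_used_kb() {
  awk '
    /^SwapTotal:/ { total = $2 }
    /^SwapFree:/  { free = $2 }
    END           { print total - free }
  ' "${1:-/proc/meminfo}"
}
```

A non-zero result alone does not prove the JVM heap is what was swapped out. Watching the `si` and `so` columns of `vmstat 1` while a slow collection is running gives a more direct indication of swap traffic.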

I resolved this by setting Linux's "swappiness" parameter to 0 (so that it wouldn't swap data out to disk).
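For reference, the setting can be inspected and changed roughly as follows. This is a sketch rather than the exact commands used here: the function names are hypothetical, changing the value needs root, and how strictly a value of 0 prevents swapping may vary between kernel versions.

```shell
# Show the current swappiness value.
current_swappiness() {
  cat /proc/sys/vm/swappiness
}

# Set swappiness to 0 until the next reboot (must be run as root).
set_swappiness_zero() {
  sysctl -w vm.swappiness=0
}

# To keep the setting across reboots, add this line to /etc/sysctl.conf:
#   vm.swappiness = 0
```

Neither function runs on its own. Call `current_swappiness` to check the value first, then `set_swappiness_zero` as root to apply the change.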
