How to find out why the JVM crashed with a segfault?

Our JVM occasionally crashes with a segfault. The only error we see in the logs is shown below.

Can anyone suggest a cause based on the error trace below?

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fef7f1d3eb0, pid=42623, tid=0x00007feea62c8700
#
# JRE version: OpenJDK Runtime Environment (8.0_222-b10) (build 1.8.0_222-b10)
# Java VM: OpenJDK 64-Bit Server VM (25.222-b10 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# J 62683 C2 org.apache.ignite.internal.marshaller.optimized.OptimizedObjectOutputStream.writeObject0(Ljava/lang/Object;)V (331 bytes) @ 0x00007fef7f1d3eb0 [0x00007fef7f1d3e00+0xb0]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/hsperfdata_pvappuser/hs_err_pid42623.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp

While trying to understand the reason for this crash, I consulted the Oracle JVM docs at https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/crashes001.html . This looks like the case described in 5.1.2 Crash in Compiled Code, since the problematic frame is a Java frame (it is marked with a "J").
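The frame-type check described above can be sketched in a few lines of Java. This is only an illustration: the class name `FrameKind` and its method are hypothetical, and the letter meanings are the ones I recall from the frame-type list in the Oracle troubleshooting guide, so verify them against the guide before relying on them.

```java
// Hypothetical helper: classifies a frame line from a HotSpot crash report
// by its leading letter. Letter meanings assumed from the Oracle guide.
final class FrameKind {

    static String describe(String frameLine) {
        String s = frameLine.trim();
        // The "Problematic frame" line in the report header starts with '#'.
        if (s.startsWith("#")) {
            s = s.substring(1).trim();
        }
        if (s.isEmpty()) {
            return "unknown";
        }
        switch (s.charAt(0)) {
            case 'C': return "native C frame";
            case 'j': return "interpreted Java frame";
            case 'V': return "VM frame";
            case 'v': return "VM-generated stub frame";
            case 'J': return "other frame type, including compiled Java";
            default:  return "unknown";
        }
    }

    public static void main(String[] args) {
        // Abbreviated form of the frame from the trace above.
        String frame = "# J 62683 C2 org.apache.ignite.internal.marshaller.optimized."
                + "OptimizedObjectOutputStream.writeObject0(Ljava/lang/Object;)V";
        System.out.println(describe(frame));
    }
}
```

For the frame in the trace, the leading "J" puts it in the compiled-Java category, which is consistent with reading this as a crash in compiled code.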

We could not get much further than that. We are also not sure what triggers the crash; the only probable pattern is that it occurs once the JVM has been running for 5-6 days, so usually on a Friday. We are using the openjdk-8 ("1.8.0_232") distribution provided by RedHat, running on RHEL 6.10.

Any pointers for tracing this error would be appreciated.

Recommended answer

The current stack frame has writeObject0 as the last called method. There is a naming convention that the names of native methods end with 0. Therefore, check whether that method is indeed native.
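One way to run that check is through reflection. The sketch below is a minimal example, not a definitive tool: `NativeCheck` and `hasNativeMethod` are hypothetical names, and it assumes the class you want to inspect (for example the Ignite class from the trace) is on the classpath when you run it. The defaults use `java.lang.Object.hashCode` only so that the program runs without extra libraries.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical helper: reports whether a class declares a native method
// with the given name.
final class NativeCheck {

    /** True if any method named methodName declared directly on cls is native. */
    static boolean hasNativeMethod(Class<?> cls, String methodName) {
        for (Method m : cls.getDeclaredMethods()) {
            if (m.getName().equals(methodName) && Modifier.isNative(m.getModifiers())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Assumed usage: java NativeCheck <fully.qualified.ClassName> <methodName>
        String className = args.length > 0 ? args[0] : "java.lang.Object";
        String methodName = args.length > 1 ? args[1] : "hashCode";
        // Load without running static initializers; only the metadata is needed.
        Class<?> cls = Class.forName(className, false, NativeCheck.class.getClassLoader());
        System.out.println(className + "." + methodName
                + " native? " + hasNativeMethod(cls, methodName));
    }
}
```

Looking at the method's declaration in the library source for a `native` modifier answers the same question without running anything.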

If it is, it is probably written in C, an old, unsafe language whose programs tend to crash in an uncontrolled way. This often leads to SIGSEGV.

In this case, though, that method is written in Java: the "J" and "C2" in the problematic frame mark it as Java code compiled by the C2 JIT compiler.

As the error message says, read hs_err_pid42623.log for further details. In that file you will find the registers and a few machine instructions around the code that crashed.
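To pull just those parts out of a long report, something like the following could help. It is a rough sketch under assumptions: the class name `HsErrSections` is hypothetical, and it assumes the report introduces these parts with lines beginning `Registers:` and `Instructions:` and ends each part with a blank line, which is how the HotSpot reports I have seen are laid out. Check your own file and adjust the headers if it differs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: extracts one blank-line-terminated section
// from the lines of an hs_err report.
final class HsErrSections {

    /**
     * Returns the first line that starts with header, plus the lines after it
     * up to (not including) the next blank line. Empty list if not found.
     */
    static List<String> section(List<String> lines, String header) {
        List<String> out = new ArrayList<>();
        boolean inside = false;
        for (String line : lines) {
            if (!inside) {
                if (line.startsWith(header)) {
                    inside = true;
                    out.add(line);
                }
            } else if (line.trim().isEmpty()) {
                break;
            } else {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java HsErrSections <path to hs_err file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        // Section headers are assumptions about the report layout.
        for (String header : new String[] {"Registers:", "Instructions:"}) {
            for (String line : section(lines, header)) {
                System.out.println(line);
            }
            System.out.println();
        }
    }
}
```

Run it with the path printed in the crash message, for example the hs_err_pid42623.log file mentioned above.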
