Is a SIGSEGV generated by `kill` special?
I know that `SIGSEGV` can't be ignored when the kernel uses it to report a memory access violation. But if I install a signal handler for `SIGSEGV` that does nothing, and then another process uses `kill` to send me that signal, will this behave the same as if I had used a "normal" signal (like `SIGUSR1`) instead?
Recommended answer
Grijesh Chauhan's answer is technically correct but difficult to understand, so I am going to write out my own exposition of basically the same points. With footnotes.[0]
Dima asks what happens when one thread installs a do-nothing handler for `SIGSEGV` and then another thread uses `kill`[1] to generate `SIGSEGV` on that thread. The one-sentence answer is that the handler runs, does nothing, and then control returns to normal flow within the interrupted thread. Unlike when the kernel generates `SIGSEGV` as a response to an actual memory access violation, this scenario does not trigger undefined behavior.[2] However, the intended purpose of `SIGSEGV` and its friends (`SIGBUS`, `SIGFPE`, and `SIGILL`) is for the kernel to tell your program that it has done something so heinous that there must be a bug, normal execution cannot continue, and would you like to clean up a little before you get killed? It is therefore unwise to use them for anything else. There are several signals (`SIGUSR1`, `SIGUSR2`, and `SIGRTMIN` through `SIGRTMAX`) reserved for each application to use however it likes; you should use one of those instead.
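The one-sentence answer can be checked with a minimal sketch. I am using Python's `signal` module here only because it is a thin wrapper over the POSIX calls; the function name `send_self_sigsegv` is made up for illustration. It assumes a POSIX system and that it is called from the main thread (Python only lets the main thread install handlers, and it runs the Python-level handler there once the C-level handler has returned).

```python
import os
import signal


def send_self_sigsegv():
    """Install a do-nothing SIGSEGV handler, kill() ourselves, and keep running."""
    delivered = []

    def handler(signum, frame):
        # Only record the delivery, then return normally.
        delivered.append(signum)

    previous = signal.signal(signal.SIGSEGV, handler)
    try:
        # Asynchronous generation: the signal comes from kill(), not from a fault.
        os.kill(os.getpid(), signal.SIGSEGV)
        after_kill = "still running"
    finally:
        # Restore the earlier disposition; None means it was not set from Python.
        signal.signal(
            signal.SIGSEGV,
            previous if previous is not None else signal.SIG_DFL,
        )
    return delivered, after_kill
```

The handler is invoked once and the statement after `os.kill` still executes, which is the "control returns to normal flow" part. None of this says anything about a `SIGSEGV` caused by a real bad memory access.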
For the longer answer, I am going to crib from the POSIX standard,[3] subsection "Signal Actions". First, the four signals `SIGFPE`, `SIGILL`, `SIGSEGV`, and `SIGBUS`, like any other signal, can be delivered "asynchronously" (at no particular time) because some piece of code used a system call (such as `kill`) to generate them. When this happens, they are treated exactly the same as any other signal whose default "action" happens to be "terminate the program abnormally"; note that many other signals have this property, including those reserved for application use. If your program only ever has to worry about receiving `SIGFPE`, `SIGILL`, `SIGSEGV`, and `SIGBUS` when they are generated by `kill` and friends, it can do all of the normal things with them: block them, ignore them, establish signal handlers that do anything that is valid for an asynchronous signal handler,[4] and receive them via `sigwait` or `signalfd` rather than the normal asynchronous-system-trap-like delivery mechanism.
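As a rough illustration of "block them and receive them via `sigwait`", here is a sketch under a few assumptions: a POSIX system, Python's `signal` module as the wrapper, and a made-up function name `take_blocked_sigsegv`. It uses `pthread_kill` rather than `kill` so that the signal is directed at the calling thread; with a process-directed `kill`, another thread that has `SIGSEGV` unblocked could receive it instead.

```python
import signal
import threading


def take_blocked_sigsegv():
    """Block SIGSEGV, generate it with pthread_kill, and collect it with sigwait."""
    wanted = {signal.SIGSEGV}
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, wanted)
    try:
        # Thread-directed and blocked, so it stays pending instead of being delivered.
        signal.pthread_kill(threading.get_ident(), signal.SIGSEGV)
        was_pending = signal.SIGSEGV in signal.sigpending()
        # sigwait dequeues the pending signal; no handler or default action runs.
        # Skip the wait if nothing is pending, so the sketch cannot hang.
        received = signal.sigwait(wanted) if was_pending else None
    finally:
        # Put the original signal mask back.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return was_pending, received
```

Because the signal is consumed by `sigwait`, the default "terminate abnormally" action never gets a chance to run, exactly as it would for `SIGUSR1`.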
However. `SIGFPE`, `SIGILL`, `SIGSEGV`, and `SIGBUS` are also generated "synchronously" by the kernel in response to different kinds of erroneous program behavior that trigger hardware exceptions, such as attempting to access unmapped memory. Synchronously means that the signal is delivered immediately upon execution of the offending CPU instruction, and on that same thread rather than on any thread in the process that happens to have it unblocked. And we're really not kidding about the "immediately" part: if there's a signal handler, when it executes, the saved program counter will be pointing at the exact instruction that caused whatever sort of hardware exception it was. The kernel can't allow execution to proceed through an instruction that causes the CPU to issue a hardware exception, so POSIX says this about attempting to discard these signals rather than terminate the process or take some drastic recovery action:
> The behavior of a process is undefined after it ignores a `SIGFPE`, `SIGILL`, `SIGSEGV`, or `SIGBUS` signal that was not generated by `kill()`, `sigqueue()`, or `raise()`.

("After it ignores" in context means "if the kernel tries to generate one of these signals synchronously when the action is set to `SIG_IGN`".)
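The flip side of that rule is that ignoring a `kill`-generated `SIGSEGV` is perfectly well defined: the signal is simply discarded. A small sketch, assuming a POSIX system, the main thread, and `SIGSEGV` not blocked by the caller (`ignore_killed_sigsegv` is a name I made up):

```python
import os
import signal


def ignore_killed_sigsegv():
    """Set SIGSEGV to SIG_IGN, send it with kill(), and report what is left over."""
    previous = signal.signal(signal.SIGSEGV, signal.SIG_IGN)
    try:
        # Generated by kill(), so ignoring it is defined: the signal is discarded.
        os.kill(os.getpid(), signal.SIGSEGV)
        nothing_pending = signal.SIGSEGV not in signal.sigpending()
    finally:
        # Restore the earlier disposition; None means it was not set from Python.
        signal.signal(
            signal.SIGSEGV,
            previous if previous is not None else signal.SIG_DFL,
        )
    return nothing_pending
```

Do not read this as permission to set `SIG_IGN` in a real program: if the same process later takes a genuine fault while the action is `SIG_IGN`, you are in the undefined territory the quotation describes.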
> The behavior of a process is undefined after it returns normally from a signal-catching function for a SIGBUS, SIGFPE, SIGILL, or SIGSEGV signal that was not generated by kill(), sigqueue(), or raise().
("Returns normally" means "not by calling `(sig)longjmp`". It is valid, at least as far as the kernel is concerned, to unwind the stack and resume execution somewhere else; you may be in for trouble if you haven't sufficiently fixed up the damaged data structure that caused the fault in the first place, though. It's also valid, as Basile mentioned, to mess with the saved processor state so that returning "normally" doesn't just try to run the same bad instruction again; but doing that tends to involve manually interpreting machine instructions and other such black magic.)
> If any of the SIGFPE, SIGILL, SIGSEGV, or SIGBUS signals are generated while they are blocked, the result is undefined, unless the signal was generated by the action of another process, or by one of the functions kill(), pthread_kill(), raise(), or sigqueue().
(I'm not sure why this wording is a little different from the other two; probably just because the person who wrote the specific documentation for `sigprocmask` didn't coordinate with the person who wrote the general documentation for signal actions.)
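For contrast with the `kill`-generated cases, the synchronous case can be observed safely from the outside by letting a child process take a real fault under the default action. This is only a sketch: `run_faulting_child` is a hypothetical helper, and `ctypes.string_at(0)` is just a convenient way to make the child dereference a null pointer. I am assuming a POSIX system where that produces `SIGSEGV` (some systems may report `SIGBUS` instead).

```python
import signal
import subprocess
import sys


def run_faulting_child():
    """Run a child interpreter that reads address 0; return its exit status."""
    code = "import ctypes; ctypes.string_at(0)"
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # A negative returncode -N means the child was terminated by signal N.
    return result.returncode


if __name__ == "__main__":
    status = run_faulting_child()
    if status < 0:
        print("child killed by", signal.Signals(-status).name)
    else:
        print("child exited with status", status)
```

Nobody called `kill` here; the kernel generated the signal in response to the faulting instruction, which is the situation all three POSIX quotations are carving out.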
[0] Dima tagged their question "Linux", but I am going to put in this caveat anyway, for the benefit of future readers: Everything I write here should be assumed to apply only to POSIX-conformant operating systems; to first order, that means "all OSes you are likely to encounter, except Windows." The exception is important. Windows has a rather different notion of the relationship between threads and processes than POSIX does, and a completely different (superior!) fundamental mechanism for reporting CPU-generated erroneous-program exceptions. Signals on Windows are emulated by the C library and probably do not behave as I describe.

[1] Presumably actually `pthread_kill`.

[2] Please read this entire series of blog posts.

[3] Specifically, the online copy of The Open Group Base Specifications Issue 7, 2013 edition, which is also "simultaneously" the 2013 edition of IEEE Standard 1003.1, POSIX.

[4] Not a lot, but more than nothing; there's a list in the "Signal Actions" document, but no fragment ID allowing me to point you right at it.