Dual emission of constructor symbols
Today, I discovered a rather interesting thing about either g++ or nm: constructor definitions appear to have two entries in libraries.
I have a header thing.hpp:
class Thing
{
Thing();
Thing(int x);
void foo();
};
and thing.cpp:
#include "thing.hpp"
Thing::Thing()
{ }
Thing::Thing(int x)
{ }
void Thing::foo()
{ }
I compile this:
g++ thing.cpp -c -o libthing.a
Then, I run nm on it:
%> nm -gC libthing.a
0000000000000030 T Thing::foo()
0000000000000022 T Thing::Thing(int)
000000000000000a T Thing::Thing()
0000000000000014 T Thing::Thing(int)
0000000000000000 T Thing::Thing()
U __gxx_personality_v0
As you can see, both of the constructors for Thing are listed with two entries in the generated static library. My g++ is 4.4.3, but the same behavior happens in clang, so it isn't just a gcc issue.
This doesn't cause any apparent problems, but I was wondering:
- Why are defined constructors listed twice?
- Why doesn't this cause "multiple definition of symbol __" issues?
EDIT: For Carl, the output without the C argument:
%> nm -g libthing.a
0000000000000030 T _ZN5Thing3fooEv
0000000000000022 T _ZN5ThingC1Ei
000000000000000a T _ZN5ThingC1Ev
0000000000000014 T _ZN5ThingC2Ei
0000000000000000 T _ZN5ThingC2Ev
U __gxx_personality_v0
As you can see, the same function is generating multiple symbols, which is still quite curious.
And while we're at it, here is a section of generated assembly:
.globl _ZN5ThingC2Ev
.type _ZN5ThingC2Ev, @function
_ZN5ThingC2Ev:
.LFB1:
.cfi_startproc
.cfi_personality 0x3,__gxx_personality_v0
pushq %rbp
.cfi_def_cfa_offset 16
movq %rsp, %rbp
.cfi_offset 6, -16
.cfi_def_cfa_register 6
movq %rdi, -8(%rbp)
leave
ret
.cfi_endproc
.LFE1:
.size _ZN5ThingC2Ev, .-_ZN5ThingC2Ev
.align 2
.globl _ZN5ThingC1Ev
.type _ZN5ThingC1Ev, @function
_ZN5ThingC1Ev:
.LFB2:
.cfi_startproc
.cfi_personality 0x3,__gxx_personality_v0
pushq %rbp
.cfi_def_cfa_offset 16
movq %rsp, %rbp
.cfi_offset 6, -16
.cfi_def_cfa_register 6
movq %rdi, -8(%rbp)
leave
ret
.cfi_endproc
So the generated code is... well... the same.
EDIT: To see which constructor actually gets called, I changed Thing::foo() to this:
void Thing::foo()
{
Thing t;
}
The generated assembly is:
.globl _ZN5Thing3fooEv
.type _ZN5Thing3fooEv, @function
_ZN5Thing3fooEv:
.LFB550:
.cfi_startproc
.cfi_personality 0x3,__gxx_personality_v0
pushq %rbp
.cfi_def_cfa_offset 16
movq %rsp, %rbp
.cfi_offset 6, -16
.cfi_def_cfa_register 6
subq $48, %rsp
movq %rdi, -40(%rbp)
leaq -32(%rbp), %rax
movq %rax, %rdi
call _ZN5ThingC1Ev
leaq -32(%rbp), %rax
movq %rax, %rdi
call _ZN5ThingD1Ev
leave
ret
.cfi_endproc
So it is invoking the complete object constructor.
Recommended answer
We'll start by declaring that GCC follows the Itanium C++ ABI.
According to the ABI, the mangled name for your Thing::foo() is easily parsed:
_Z | N | 5Thing | 3foo | E | v
prefix | nested | `Thing` | `foo`| end nested | parameters: `void`
You can read the constructor names similarly, as below. Notice how the constructor "name" isn't given, but instead a C clause:
_Z | N | 5Thing | C1 | E | i
prefix | nested | `Thing` | Constructor | end nested | parameters: `int`
But what's this C1? Your duplicate has C2. What does this mean?
Well, this is quite simple too:
<ctor-dtor-name> ::= C1 # complete object constructor
::= C2 # base object constructor
::= C3 # complete object allocating constructor
::= D0 # deleting destructor
::= D1 # complete object destructor
::= D2 # base object destructor
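To make the two breakdowns above concrete, here is a minimal sketch of a reader for this one shape of mangled name. It is not a real demangler: it assumes only the form _Z N <length><name>... [C1|C2|C3|D0|D1|D2] E <parameters>, and the names ParsedName, describe_ctor_dtor and parse_simple_nested_name are made up for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Result of reading one simple nested mangled name (hypothetical helper type).
struct ParsedName {
    std::vector<std::string> components;  // e.g. {"Thing", "foo"}
    std::string ctor_dtor_kind;           // empty unless a C?/D? token was seen
    std::string parameter_encoding;       // raw text after 'E', e.g. "v" or "i"
    bool ok = false;
};

// Maps a <ctor-dtor-name> token to the description given in the grammar above.
inline std::string describe_ctor_dtor(const std::string& token) {
    if (token == "C1") return "complete object constructor";
    if (token == "C2") return "base object constructor";
    if (token == "C3") return "complete object allocating constructor";
    if (token == "D0") return "deleting destructor";
    if (token == "D1") return "complete object destructor";
    if (token == "D2") return "base object destructor";
    return std::string();
}

// Handles only names of the form _ZN ... E <params>; anything else yields ok == false.
inline ParsedName parse_simple_nested_name(const std::string& mangled) {
    ParsedName result;
    if (mangled.compare(0, 3, "_ZN") != 0) return result;
    std::size_t pos = 3;
    while (pos < mangled.size() && mangled[pos] != 'E') {
        if (std::isdigit(static_cast<unsigned char>(mangled[pos]))) {
            // <length><name>: read the decimal length, then that many characters.
            std::size_t length = 0;
            while (pos < mangled.size() &&
                   std::isdigit(static_cast<unsigned char>(mangled[pos]))) {
                length = length * 10 + static_cast<std::size_t>(mangled[pos] - '0');
                ++pos;
            }
            if (pos + length > mangled.size()) return result;
            result.components.push_back(mangled.substr(pos, length));
            pos += length;
        } else {
            // Otherwise expect a two-character constructor/destructor token.
            const std::string kind = describe_ctor_dtor(mangled.substr(pos, 2));
            if (kind.empty()) return result;
            result.ctor_dtor_kind = kind;
            pos += 2;
        }
    }
    if (pos >= mangled.size()) return result;  // no closing 'E' found
    result.parameter_encoding = mangled.substr(pos + 1);
    result.ok = true;
    return result;
}
```

Feeding it _ZN5ThingC1Ei and _ZN5ThingC2Ei gives the same component (Thing) and the same parameter encoding (i); only the constructor kind differs, which is exactly the difference between the two nm entries.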
Wait, why is this simple? This class has no base. Why does it have a "complete object constructor" and a "base object constructor" for each?
This Q&A implies to me that this is simply a by-product of polymorphism support, even though it's not actually required in this case.
Note that c++filt used to include this information in its demangled output, but doesn't any more.
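You can see the same loss of information from inside a program. The sketch below assumes a GCC or Clang toolchain whose runtime provides abi::__cxa_demangle in <cxxabi.h>; the wrapper name demangle is made up for illustration.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <string>

// Thin wrapper around abi::__cxa_demangle (hypothetical helper name).
// Returns an empty string if the input is not a valid mangled name.
inline std::string demangle(const char* mangled) {
    int status = 0;
    // Passing null buffer/length asks the runtime to malloc the result.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return std::string();
    }
    std::string result(text);
    std::free(text);  // the caller owns the malloc'ed buffer
    return result;
}
```

Calling demangle("_ZN5ThingC1Ei") and demangle("_ZN5ThingC2Ei") yields the same text for both, which is why nm -C prints what looks like a duplicate entry even though the underlying symbols are distinct.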
This forum post asks the same question, and the only response doesn't do any better at answering it, except for the implication that GCC could avoid emitting two constructors when polymorphism is not involved, and that this behaviour ought to be improved in the future.
This newsgroup posting describes a problem with setting breakpoints in constructors due to this dual-emission. It's stated again that the root of the issue is support for polymorphism.
In fact, this is listed as a GCC "known issue":
G++ emits two copies of constructors and destructors.
In general there are three types of constructors (and destructors).
- The complete object constructor/destructor.
- The base object constructor/destructor.
- The allocating constructor/deallocating destructor.
The first two are different when virtual base classes are involved.
The meaning of these different constructors seems to be as follows:
The "complete object constructor". It additionally constructs virtual base classes.
The "base object constructor". It creates the object itself, as well as data members and non-virtual base classes.
The "allocating object constructor". It does everything the complete object constructor does, plus it calls operator new to actually allocate the memory... but apparently this is not usually seen.
If you have no virtual base classes, [the first two] are identical; GCC will, on sufficient optimization levels, actually alias the symbols to the same code for both.
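To see a case where the two variants really must differ, here is a small sketch with a virtual base. The class names (VirtualBase, Middle, MostDerived) and the two helper functions are made up for illustration; the behaviour itself is standard C++: only the most-derived class initializes a virtual base.

```cpp
#include <string>

// Records which class's constructor initialized the virtual base.
struct VirtualBase {
    std::string initialized_by;
    explicit VirtualBase(const std::string& who) : initialized_by(who) {}
};

struct Middle : virtual VirtualBase {
    // When Middle is the complete object, this VirtualBase initializer runs.
    // When Middle is only a base subobject, the initializer is ignored,
    // because the most-derived class is responsible for the virtual base.
    Middle() : VirtualBase("Middle") {}
};

struct MostDerived : Middle {
    // MostDerived is the complete object here, so it constructs VirtualBase
    // itself and then runs Middle's constructor as a base subobject.
    MostDerived() : VirtualBase("MostDerived"), Middle() {}
};

// Hypothetical helpers that report who initialized the virtual base.
inline std::string virtual_base_owner_for_middle() {
    Middle m;
    return m.initialized_by;
}

inline std::string virtual_base_owner_for_most_derived() {
    MostDerived d;
    return d.initialized_by;
}
```

The first helper returns "Middle" and the second returns "MostDerived", so Middle needs two different constructor bodies: one that sets up VirtualBase and one that skips it. With Thing there is no virtual base, so the two bodies come out identical, as in the assembly shown in the question.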