What do C and Assembler actually compile to?
So I found out that C(++) programs actually don't compile to plain "binary" (I may have gotten some things wrong here, in that case I'm sorry :D) but to a range of things (symbol table, OS-related stuff, ...). But...
Does assembler "compile" to pure binary? That means no extra stuff besides resources like predefined strings, etc.
If C compiles to something other than plain binary, how can that small assembler bootloader just copy the instructions from the HDD to memory and execute them? I mean, if the OS kernel, which is probably written in C, compiles to something different from plain binary, how does the bootloader handle it?
Edit: I know that assembler doesn't "compile" because it only has your machine's instruction set. I didn't find a good word for what assembler "assembles" to. If you have one, leave it here as a comment and I'll change it.
Recommended answer
C typically compiles to assembler, just because that makes life easy for the poor compiler writer.
Assembly code always assembles (not "compiles") to relocatable object code. You can think of this as binary machine code and binary data, but with lots of decoration and metadata. The key parts are:
Code and data appear in named "sections".
Relocatable object files may include definitions of labels, which refer to locations within the sections.
Relocatable object files may include "holes" that are to be filled with the values of labels defined elsewhere. The official name for such a hole is a relocation entry.
For example, if you compile and assemble (but don't link) this program
#include <stdio.h>
int main () { printf("Hello, world\n"); }
you are likely to wind up with a relocatable object file with
A text section containing the code for main
A label definition for main which points to the beginning of the text section
A rodata (read-only data) section containing the bytes of the string literal "Hello, world\n"
A relocation entry that depends on printf and that points to a "hole" in a call instruction in the middle of a text section.
If you are on a Unix system, a relocatable object file is generally called a .o file, as in hello.o. You can explore the label definitions and uses with a simple tool called nm, and you can get more detailed information from a somewhat more complicated tool called objdump.
I teach a class that covers these topics, and I have students write an assembler and linker, which takes a couple of weeks, but when they've done that most of them have a pretty good handle on relocatable object code. It's not such an easy thing.