Why does LLVM segfault when I try to emit object code?

2022-01-12 00:00:00 segmentation-fault llvm c++ llvm-ir

I'm trying to follow along with the LLVM tutorial on compiler implementation, but my code segfaults when I try to emit object code.

Here's a minimal example that attempts to compile a function func. To keep things simple, func is a function that does nothing.

#include <iostream>
#include <llvm/ADT/Optional.h>
#include <llvm/IR/BasicBlock.h>
#include <llvm/IR/DerivedTypes.h>
#include <llvm/IR/Function.h>
#include <llvm/IR/IRBuilder.h>
#include <llvm/IR/LLVMContext.h>
#include <llvm/IR/LegacyPassManager.h>
#include <llvm/IR/Module.h>
#include <llvm/IR/Type.h>
#include <llvm/IR/Verifier.h>
#include <llvm/Support/CodeGen.h>
#include <llvm/Support/FileSystem.h>
#include <llvm/Support/Host.h>
#include <llvm/Support/TargetRegistry.h>
#include <llvm/Support/TargetSelect.h>
#include <llvm/Support/raw_ostream.h>
#include <llvm/Target/TargetMachine.h>
#include <llvm/Target/TargetOptions.h>
#include <stdexcept>
#include <string>
#include <system_error>
#include <vector>

int main() {

    llvm::LLVMContext context;
    llvm::IRBuilder<> builder(context);
    llvm::Module      module("module", context);

    llvm::Function* const func = llvm::Function::Create(
        llvm::FunctionType::get(llvm::Type::getVoidTy(context),
                                std::vector<llvm::Type*>(), false),
        llvm::Function::ExternalLinkage, "func", &module
    );

    builder.SetInsertPoint(llvm::BasicBlock::Create(context, "entry", func));

    llvm::verifyFunction(*func);

    func->dump();

    llvm::InitializeAllTargetInfos();
    llvm::InitializeAllTargets();
    llvm::InitializeAllTargetMCs();
    llvm::InitializeAllAsmParsers();
    llvm::InitializeAllAsmPrinters();

    const std::string triple = llvm::sys::getDefaultTargetTriple();

    std::string message;
    const llvm::Target* const target = llvm::TargetRegistry::lookupTarget(
        triple, message
    );
    if (!target) throw std::runtime_error("Couldn't find target.");

    llvm::TargetMachine* const machine = target->createTargetMachine(
        triple, "generic", "", llvm::TargetOptions(),
        llvm::Optional<llvm::Reloc::Model>()
    );

    module.setDataLayout(machine->createDataLayout());
    module.setTargetTriple(triple);

    std::error_code code;
    llvm::raw_fd_ostream obj_file("func.o", code, llvm::sys::fs::F_None);
    if (code) throw std::runtime_error("Couldn't open object file.");

    llvm::legacy::PassManager manager;
    if (
        machine->addPassesToEmitFile(manager, obj_file,
                                     llvm::TargetMachine::CGFT_ObjectFile)
    ) throw std::runtime_error("Adding passes failed.");

    std::cout << "Running pass manager." << std::endl;
    manager.run(module);
    std::cout << "Ran pass manager." << std::endl;

    obj_file.flush();

}

Here's the command I'm compiling with. I'm using GCC version 6.3.1 and LLVM version 3.9.1.

g++ src/main.cc -o bin/test -std=c++1z -Wall -Wextra \
    -Wno-unused-function -Wno-unused-value -Wno-unused-parameter \
    -Werror -ggdb -O0 `llvm-config --system-libs --libs core`

Here's the output.

define void @func() {
entry:
}

Running pass manager.
Segmentation fault (core dumped)

The translation to IR succeeds (at least, the dump looks correct to me), but the segfault occurs when calling llvm::legacy::PassManager::run.

I tried stepping through the code with GDB. Here's the backtrace from the moment of the segfault.

#0  0x00007ffff56ce72f in ?? () from /usr/lib/libLLVM-3.9.so
#1  0x00007ffff56477c2 in llvm::FPPassManager::runOnFunction(llvm::Function&) () from /usr/lib/libLLVM-3.9.so
#2  0x00007ffff5647b4b in llvm::FPPassManager::runOnModule(llvm::Module&) () from /usr/lib/libLLVM-3.9.so
#3  0x00007ffff5647e74 in llvm::legacy::PassManagerImpl::run(llvm::Module&) () from /usr/lib/libLLVM-3.9.so
#4  0x0000000000403ab6 in main () at src/main.cc:76

Unfortunately, my LLVM installation (installed using pacman on Arch Linux) doesn't seem to have line-number debugging information, so I can't tell exactly where in llvm::FPPassManager::runOnFunction's execution the problem is occurring.

Is there anything obviously wrong, either in concept or in implementation, with what I'm trying to do?

Recommended answer

Every LLVM basic block must end with a terminator instruction (see e.g. http://llvm.org/docs/doxygen/html/classllvm_1_1BasicBlock.html#details). Your entry block is empty, so it has none. In your case, the generated IR should look like this:

define void @func() {
entry:
    ret void
}

In your C++ code, you need to add builder.CreateRetVoid() after setting the insert point and before you call llvm::verifyFunction.

Also, llvm::verifyFunction is not visibly reporting an error because you haven't passed the second parameter, which is the stream LLVM should write errors to. Try this instead to output to stderr:

llvm::verifyFunction(*func, &llvm::errs());

You should also check the return value of llvm::verifyFunction. A return value of true indicates an error.

See: http://llvm.org/docs/doxygen/html/namespacellvm.html#a26389c546573f058ad8ecbdc5c1933cf and http://llvm.org/docs/doxygen/html/raw__ostream_8h.html

You should also consider verifying the entire module before generating object files by calling llvm::verifyModule(theModule, theOsStream) (see http://llvm.org/docs/doxygen/html/Verifier_8h.html).

Finally, I'd recommend inspecting the IR that Clang generates when compiling C code, so that you can see what correctly generated IR looks like. For example, you can create a simple C file as follows:

// test.c
void func(void) {}

Then compile it and view the result as follows:

clang -S -emit-llvm test.c
cat test.ll

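This gives something like the following (the output shown here evidently came from a C++ build with debug info, hence the mangled name _Z4funcv and the !dbg metadata; compiled as plain C, the function is named @func):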

define dso_local void @_Z4funcv() #0 !dbg !7 {
  ret void, !dbg !11
}

attributes #0 = { noinline nounwind optnone uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
