How do I use etrace with a dynamic library to trace function calls in C++ in chronological order?

2022-01-11 00:00:00 profiling c perl macros c++

Background:

I have a large simulation tool, and I need to understand its logical behavior. What would help most is the chronological order of its function calls, ideally shown on a minimal working example.

I found many tools online, such as CygProfiler and etrace. I grew so desperate for a solution that I resorted to the craziest option: using "step into" in the debugger. That is a fine option for a small program, but not for a complete simulation tool.


Problem:

One of the problems I face is that the above-mentioned solutions are originally meant for C, and they generate a static object file (*.o) when compiled. The simulation tool, on the other hand, generates a shared library (.so). I don't know much about lower-level details, so my attempts to link them fail.

I looked specifically at the etrace documentation, and it says:

To see how to modify ptrace.c to work with a dynamic library, look at the example2 directory. The sources there also create a stand-alone executable, but the PTRACE_REFERENCE_FUNCTION macro is defined just as it would be for a dynamic library.

If you look at the repo, there is no difference between the files in the example and example2 folders, except that example2 has an extra .h file.

On the other hand, if you look at src/ptrace.c there it says:

When using ptrace on a dynamic library, you must set the PTRACE_REFERENCE_FUNCTION macro to be the name of a function in the library. The address of this function when loaded will be the first line output to the trace file and will permit the translation of the other entry and exit pointers to their symbolic names. You may set the macro PTRACE_INCLUDE with any #include directives needed for that function to be accesible to this source file.

A little below that is the commented-out code:

/* When using ptrace on a dynamic library, the following must be defined:
#include "any files needed for PTRACE_REFERENCE_FUNCTION"
#define PTRACE_REFERENCE_FUNCTION functionName
*/


Question:

In essence the question is the following: How to use etrace with a dynamic library?

Do I need to #include any files?

To trace a stand-alone program, there is no need to #include any additional file. Just link your code against ptrace.c and use the -finstrument-functions option as a compile option for gcc. This should do it.

How do I link C++ code that is built via makefiles against ptrace.c?

Final note: I would appreciate it if someone could bear with my ignorance and provide a step-by-step solution to my question.


Update 1:

I managed to add the libraries related to etrace to the simulation tool, and it executes fine.

However (probably because the scripts are too old, or are not meant for use with C++), I get the following error when using the Perl script that etrace provides by default:

Hexadecimal number > 0xffffffff non-portable

This probably changes the nature of the question a bit, turning it into more of a Perl-related issue at this point.

If this problem is solved, I hope etrace will work with a complicated project, and I will provide the details.


Update 2:

I took the suggestions from @Harry, and I believe that would work in most projects. However in my case I get the following from the perl script:

Use of uninitialized value within %SYMBOLTABLE in list assignment at etrace2.pl line 99, <CALL_DATA> line 1.

-- ???
|   -- ???
-- ???
|   -- ???
|   |   -- ???
-- ???
|   -- ???
-- ???
|   -- ???
-- ???
|   -- ???
-- ???
|   -- ???
-- ???
|   -- ???

Because the makefiles are autogenerated, I used LD_PRELOAD to load the etrace.so shared library, which I built as follows:

gcc -g -finstrument-functions -shared -fPIC ptrace.c -o etrace.so -I <path-to-etrace>

I created the dummy etrace.h inside the tool:

#ifndef __ETRACE_H_
#define __ETRACE_H_

#include <stdio.h>

void Crumble_buy(char * what, int quantity, char * unit);


void Crumble_buy(char * what, int quantity, char * unit)
{
    printf("buy %d %s of %s\n", quantity, unit, what);
}

#endif

I then used Crumble_buy for the #define and etrace.h for the #include.

Solution

Fixing the Perl Script

Hexadecimal number > 0xffffffff non-portable

This is a warning from Perl's hex function, which has detected a possibly non-portable value (one that needs more than 32 bits).

At the very top of the script, add this:

use bigint qw/hex oct/;

When this tool was written, I suspect the people were on 32-bit machines. You can compile the program using 32-bit with the flag -m32, but if you change the perl script as mentioned above you won't need to.
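To illustrate why the warning appears at all, here is a minimal C sketch (not part of etrace) that checks whether a hexadecimal string, such as an address of the size nm reports for a 64-bit binary, exceeds 0xffffffff. The helper name exceeds_32_bits is made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: true if the hex string denotes a value
   larger than 0xffffffff, i.e. one that needs more than 32 bits. */
static bool exceeds_32_bits(const char *hex_text)
{
    /* strtoull with base 16 accepts digits with or without a 0x prefix. */
    unsigned long long value = strtoull(hex_text, NULL, 16);
    return value > UINT32_MAX;
}
```

An address such as 00000001000005b0 is above that limit, which is the kind of value hex complains about; ffffffff is not.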

Note, if you're on a Mac, you can't use mknod the way it's used in the script to create a pipe; you need to use mkfifo with no arguments instead.

On Linux, adding the bigint fix above works. You then need to run both commands from the same directory; I did this using example2:

../src/etrace.pl crumble
# Switch to a different terminal
./crumble

and I get this on both the Mac and Linux:

-- main
|   -- Crumble_make_apple_crumble
|   |   -- Crumble_buy_stuff
|   |   |   -- Crumble_buy
|   |   |   -- Crumble_buy
|   |   |   -- Crumble_buy
|   |   |   -- Crumble_buy
|   |   |   -- Crumble_buy
|   |   -- Crumble_prepare_apples
|   |   |   -- Crumble_skin_and_dice
|   |   -- Crumble_mix
|   |   -- Crumble_finalize
|   |   |   -- Crumble_put
|   |   |   -- Crumble_put
|   |   -- Crumble_cook
|   |   |   -- Crumble_put
|   |   |   -- Crumble_bake

About the Dynamic Library...

When you load a dynamic library, the address in the object file is not the address that will be used when running. What etrace does is take a function name from a header you specify. For example, in the case of example2, this would be the following:

#include "crumble.h"
#define PTRACE_REFERENCE_FUNCTION Crumble_buy

You would then edit the makefile to make sure that the header file can be found:

CFLAGS = -g -finstrument-functions -I.

Note the added include path, -I. (the current directory). The address of the symbol declared in the header (in our case, Crumble_buy) is used to calculate the offset between the address in the object file and the actual runtime address; this allows the program to calculate the correct address to find the symbol.
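For context, this is roughly what instrumentation hooks of this kind look like. It is a minimal sketch, not the actual ptrace.c: the hook names __cyg_profile_func_enter and __cyg_profile_func_exit are the ones GCC calls under -finstrument-functions, but reference_function, trace_init, the trace_lines counter, and the exact line format written here are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical stand-in for a function exported by the traced library. */
void reference_function(void) {}

static FILE *trace_out;      /* trace destination; NULL until first use */
static int trace_lines = 0;  /* number of lines written so far */

/* On first use, open the trace and write the reference function's
   runtime address before any other line. The hooks themselves must
   not be instrumented, hence the attribute. */
__attribute__((no_instrument_function))
static void trace_init(void)
{
    if (trace_out != NULL)
        return;
    trace_out = stderr;
    /* Illustrative line format only. */
    fprintf(trace_out, "REFERENCE: reference_function %p\n",
            (void *)reference_function);
    trace_lines++;
}

/* Called at every function entry in instrumented code. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    trace_init();
    fprintf(trace_out, "enter %p\n", this_fn);
    trace_lines++;
}

/* Called at every function exit in instrumented code. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;
    trace_init();
    fprintf(trace_out, "exit %p\n", this_fn);
    trace_lines++;
}
```

The point of the sketch is the ordering: the reference address is written once, ahead of all entry and exit records, so that whatever reads the trace can work out the load offset before it has to resolve any other address.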

If you look at the output of nm, you get something like the following:

0000000100000960 T _Crumble_bake
00000001000005b0 T _Crumble_buy
0000000100000640 T _Crumble_buy_stuff
00000001000009f0 T _Crumble_cook

The addresses on the left are the ones recorded in the file; at runtime, the actual load addresses differ. The etrace.pl program stores them in a hash like this:

$VAR1 = {
          '4294969696' => '_Crumble_bake',
          '4294969424' => '_Crumble_put',
          '4294970096' => '_main',
          '4294969264' => '_Crumble_mix',
          '4294970704' => '_gnu_ptrace_close',
          '4294967296' => '__mh_execute_header',
          '4294968752' => '_Crumble_buy',
          '4294968896' => '_Crumble_buy_stuff',
          '4294969952' => '_Crumble_make_apple_crumble',
          '4294969184' => '_Crumble_prepare_apples',
          '4294971512' => '___GNU_PTRACE_FILE__',
          '4294971504' => '_gnu_ptrace.first',
          '4294970208' => '_gnu_ptrace',
          '4294970656' => '___cyg_profile_func_exit',
          '4294970608' => '___cyg_profile_func_enter',
          '4294969552' => '_Crumble_finalize',
          '4294971508' => '_gnu_ptrace.active',
          '4294969840' => '_Crumble_cook',
          '4294969088' => '_Crumble_skin_and_dice',
          '4294970352' => '_gnu_ptrace_init'
        };

Note the leading underscores, which appear because this is on a Mac using clang. At runtime, these addresses are not correct, but their relative offsets are. If you can work out what the offset is, you can adjust the addresses you get at runtime to find the actual symbol. The code that does this follows:

  if ($offsetLine =~ m/^$REFERENCE_OFFSET\s+($SYMBOL_NAME)\s+($HEX_NUMBER)$/) {
    # This is a dynamic library; need to calculate the load offset
    my $offsetSymbol  = "_$1";
    my $offsetAddress = hex $2;

    my %offsetTable = reverse %SYMBOLTABLE;

    print Dumper(\%offsetTable);
    $baseAddress = $offsetTable{$offsetSymbol} - $offsetAddress;
    #print("offsetSymbol == $offsetSymbol\n");
    #print("offsetAddress == $offsetAddress\n");
    #print("baseoffsetAddress == $offsetAddress\n");
    $offsetLine = <CALL_DATA>;
  } else {
    # This is static
    $baseAddress = 0;
  }

This is what the line #define PTRACE_REFERENCE_FUNCTION Crumble_buy is for. The C code in ptrace.c uses that macro and, if it is defined, outputs the address of that function as the first line of the trace. The Perl script then calculates the offset and adjusts every subsequent address by that amount before looking up the correct symbol in the hash.
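The arithmetic can be sketched in C as follows. This is an illustration of the idea rather than a port of etrace.pl: the table holds the four nm entries shown above, while the struct, the function names, and the runtime addresses used in the usage note are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One row of the static symbol table, as listed by nm. */
struct symbol {
    uint64_t    address;
    const char *name;
};

/* Values copied from the nm output above. */
static const struct symbol SYMBOLS[] = {
    { 0x0000000100000960ULL, "_Crumble_bake"      },
    { 0x00000001000005b0ULL, "_Crumble_buy"       },
    { 0x0000000100000640ULL, "_Crumble_buy_stuff" },
    { 0x00000001000009f0ULL, "_Crumble_cook"      },
};
static const size_t SYMBOL_COUNT = sizeof SYMBOLS / sizeof SYMBOLS[0];

/* Static (nm) address of a symbol, or 0 if the name is unknown. */
static uint64_t static_address(const char *name)
{
    for (size_t i = 0; i < SYMBOL_COUNT; i++)
        if (strcmp(SYMBOLS[i].name, name) == 0)
            return SYMBOLS[i].address;
    return 0;
}

/* base = static address of the reference function minus its runtime
   address. Unsigned arithmetic wraps around, so the result is still
   usable when the runtime address is the larger of the two. */
static uint64_t base_address(const char *ref_name, uint64_t ref_runtime)
{
    return static_address(ref_name) - ref_runtime;
}

/* Translate a runtime address back to a symbol name, or "???" when
   the adjusted address is not in the table. */
static const char *resolve(uint64_t runtime, uint64_t base)
{
    uint64_t wanted = runtime + base;
    for (size_t i = 0; i < SYMBOL_COUNT; i++)
        if (SYMBOLS[i].address == wanted)
            return SYMBOLS[i].name;
    return "???";
}
```

Suppose, purely as an example, that the trace reports _Crumble_buy at runtime address 0x7f3a000005b0. Then base_address("_Crumble_buy", 0x7f3a000005b0) gives the adjustment, and resolve(0x7f3a00000960, base) lands on 0x100000960, which is _Crumble_bake. An address that does not map onto any table entry comes back as "???", which is the symptom shown in Update 2 when the base address is wrong or the symbol table does not match the traced code.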
