
addr2line: How It Works and How to Use It


Software version    Hardware version    Change notes

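1. Overview

When a program crashes, the system usually performs a dump_stack for us, so we can see the call stack at the moment of the crash. To go from that call stack information back to the exact line of source code, we use addr2line.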

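2. How it works

As we know, besides the sections defined by the ELF standard, an ELF file may also carry extra sections that hold other kinds of information. When we compile with gcc's -g option, the resulting ELF file contains a group of such non-standard sections named .debug_*, for example .debug_info and .debug_line. The address-to-line mapping that addr2line needs is stored in .debug_line, while .debug_info describes the functions, variables and types of the program.

The contents of these sections follow the format defined by DWARF, so they have to be parsed according to the DWARF specification. addr2line is exactly such a program: it uses the DWARF data to turn a program address into file name:line number.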

For example, take the following simple program, test.c:

c
#include <stdio.h>

int add(int a, int b) {
	return a + b;
}

int main(int argc, char *argv[])
{
	int a = 0;
	int b = 1;
	int c = 0;

	c = add(a, b);

	return 0;
}

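Then compile it with gcc -g -o test ./test.c to produce an ELF file named test.

Next, dump its debug sections with readelf -w ./test. The part that matters here is the line number program from the .debug_line section, shown below: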

c
...

The File Name Table (offset 0x9f):
  Entry Dir     Time    Size    Name
  1     1       0       0       test.c
  2     2       0       0       stddef.h
  3     3       0       0       types.h
  4     4       0       0       struct_FILE.h
  5     4       0       0       FILE.h
  6     5       0       0       stdio.h
  7     3       0       0       sys_errlist.h

Line Number Statements:
  [0x000000f8]  Set column to 23
  [0x000000fa]  Extended opcode 2: set Address to 0x1129
  [0x00000105]  Special opcode 7: advance Address by 0 to 0x1129 and Line by 2 to 3
  [0x00000106]  Set column to 11
  [0x00000108]  Special opcode 202: advance Address by 14 to 0x1137 and Line by 1 to 4
  [0x00000109]  Set column to 1
  [0x0000010b]  Special opcode 118: advance Address by 8 to 0x113f and Line by 1 to 5
  [0x0000010c]  Special opcode 36: advance Address by 2 to 0x1141 and Line by 3 to 8
  [0x0000010d]  Set column to 6
  [0x0000010f]  Advance PC by constant 17 to 0x1152
  [0x00000110]  Special opcode 34: advance Address by 2 to 0x1154 and Line by 1 to 9
  [0x00000111]  Special opcode 104: advance Address by 7 to 0x115b and Line by 1 to 10
  [0x00000112]  Special opcode 104: advance Address by 7 to 0x1162 and Line by 1 to 11
  [0x00000113]  Special opcode 105: advance Address by 7 to 0x1169 and Line by 2 to 13
  [0x00000114]  Set column to 9
  [0x00000116]  Advance PC by constant 17 to 0x117a
  [0x00000117]  Special opcode 21: advance Address by 1 to 0x117b and Line by 2 to 15
  [0x00000118]  Set column to 1
  [0x0000011a]  Special opcode 76: advance Address by 5 to 0x1180 and Line by 1 to 16
  [0x0000011b]  Advance PC by 2 to 0x1182
  [0x0000011d]  Extended opcode 1: End of Sequence

...

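The information above records the mapping between machine instruction addresses and source line numbers. Take [0x00000110] Special opcode 34: advance Address by 2 to 0x1154 and Line by 1 to 9 as an example: the 9 at the end of the line is line 9 of the source file test.c, and 0x1154 is the address of a machine instruction.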
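
How one "Special opcode" packs an address advance and a line advance into a single byte can be sketched in C as follows. The header values used here (line_base = -5, line_range = 14, opcode_base = 13, minimum instruction length 1) are assumptions: the header is elided from the dump above, but these values are consistent with every row shown. The sketch also assumes that readelf prints the opcode after subtracting opcode_base. The names decode_special and const_add_pc_advance are made up for illustration.

```c
#include <assert.h>

/* Assumed line-program header values (NOT shown in the excerpt above). */
enum {
    LINE_BASE = -5,      /* smallest line advance a special opcode can encode */
    LINE_RANGE = 14,     /* number of distinct line advances */
    OPCODE_BASE = 13,    /* number of the first special opcode */
    MIN_INST_LENGTH = 1  /* x86-64: addresses advance in units of one byte */
};

struct advance {
    unsigned address; /* bytes added to the address register */
    int line;         /* lines added to the line register */
};

/* Decode a special opcode as printed by readelf, i.e. already adjusted
 * (raw opcode minus OPCODE_BASE). */
static struct advance decode_special(unsigned adjusted)
{
    struct advance adv;

    assert(adjusted <= 255u - OPCODE_BASE);
    adv.address = (adjusted / LINE_RANGE) * MIN_INST_LENGTH;
    adv.line = LINE_BASE + (int)(adjusted % LINE_RANGE);
    return adv;
}

/* "Advance PC by constant" moves the address by the same amount special
 * opcode 255 would, without touching the line register. */
static unsigned const_add_pc_advance(void)
{
    return ((255u - OPCODE_BASE) / LINE_RANGE) * MIN_INST_LENGTH;
}
```

Under these assumptions decode_special(34) gives an address advance of 2 and a line advance of 1, which matches the row at [0x00000110], and const_add_pc_advance() gives 17, which matches the "Advance PC by constant 17" rows.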

Then we disassemble the file with objdump -d ./test, which gives the following:

c
...

0000000000001129 <add>:
    1129:       f3 0f 1e fa             endbr64
    112d:       55                      push   %rbp
    112e:       48 89 e5                mov    %rsp,%rbp
    1131:       89 7d fc                mov    %edi,-0x4(%rbp)
    1134:       89 75 f8                mov    %esi,-0x8(%rbp)
    1137:       8b 55 fc                mov    -0x4(%rbp),%edx
    113a:       8b 45 f8                mov    -0x8(%rbp),%eax
    113d:       01 d0                   add    %edx,%eax
    113f:       5d                      pop    %rbp
    1140:       c3                      retq

0000000000001141 <main>:
    1141:       f3 0f 1e fa             endbr64
    1145:       55                      push   %rbp
    1146:       48 89 e5                mov    %rsp,%rbp
    1149:       48 83 ec 20             sub    $0x20,%rsp
    114d:       89 7d ec                mov    %edi,-0x14(%rbp)
    1150:       48 89 75 e0             mov    %rsi,-0x20(%rbp)
    1154:       c7 45 f4 00 00 00 00    movl   $0x0,-0xc(%rbp)
    115b:       c7 45 f8 01 00 00 00    movl   $0x1,-0x8(%rbp)
    1162:       c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
    1169:       8b 55 f8                mov    -0x8(%rbp),%edx
    116c:       8b 45 f4                mov    -0xc(%rbp),%eax
    116f:       89 d6                   mov    %edx,%esi
    1171:       89 c7                   mov    %eax,%edi
    1173:       e8 b1 ff ff ff          callq  1129 <add>
    1178:       89 45 fc                mov    %eax,-0x4(%rbp)
    117b:       b8 00 00 00 00          mov    $0x0,%eax
    1180:       c9                      leaveq
    1181:       c3                      retq
    1182:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
    1189:       00 00 00
    118c:       0f 1f 40 00             nopl   0x0(%rax)

...

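We can see that the instruction at 0x1154 belongs to line 9 of the source, which initializes a local variable (int a = 0). That is the basic principle behind addr2line. The finer details require a careful study of DWARF; personally I do not think that is worth the effort for everyday use, and knowing the general idea is enough.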
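
The lookup itself can be sketched in C as follows. This is only an illustration of the idea, not how addr2line is implemented: the table is transcribed by hand from the readelf dump above (one row per special opcode, plus the end-of-sequence address), and the names line_row, test_rows and lookup_line are made up for this example. The real tool parses the DWARF data itself and has to deal with many compilation units and sequences.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct line_row {
    uint64_t address; /* first address covered by this row */
    int line;         /* source line in test.c; 0 marks end of sequence */
};

/* Rows taken from the dump above, in increasing address order. */
static const struct line_row test_rows[] = {
    {0x1129, 3},  {0x1137, 4},  {0x113f, 5},  {0x1141, 8},
    {0x1154, 9},  {0x115b, 10}, {0x1162, 11}, {0x1169, 13},
    {0x117b, 15}, {0x1180, 16},
    {0x1182, 0}, /* End of Sequence: first address past the code */
};

/* Return the line of the last row whose address is <= addr,
 * or -1 if addr lies outside the sequence. */
static int lookup_line(const struct line_row *rows, size_t n, uint64_t addr)
{
    size_t lo, hi;

    assert(rows != NULL);
    if (n < 2 || addr < rows[0].address || addr >= rows[n - 1].address)
        return -1;

    /* Invariant: rows[lo].address <= addr < rows[hi].address */
    lo = 0;
    hi = n - 1;
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (rows[mid].address <= addr)
            lo = mid;
        else
            hi = mid;
    }
    return rows[lo].line;
}
```

With this table, looking up 0x1173 (the callq instruction in main) yields 13, the line containing c = add(a, b), while 0x1182 (the padding after the end of the sequence) yields -1.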

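3. Using addr2line

addr2line is very simple to use. The most common form is addr2line -e <ELF file> <address ...>; if -e is not given, it uses a.out by default. As for where the address comes from: read the start address of the symbol from the ELF file and add the offset shown in the call stack at the time of the crash. When the kernel crashes, it prints stack information similar to the following: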

c
root@raspi:~# echo c /proc/sysrq-trigger
c /proc/sysrq-trigger
root@raspi:~# echo c > /proc/sysrq-trigger
[31323.615771] sysrq: Trigger a crash
[31323.619359] Kernel panic - not syncing: sysrq triggered crash
[31323.625243] CPU: 3 PID: 725 Comm: bash Not tainted 5.8.18-g6c23a5884ae7 #1
[31323.632247] Hardware name: Raspberry Pi 4 Model B (DT)
[31323.637489] Call trace:
[31323.640007]  dump_backtrace+0x0/0x188
[31323.643756]  show_stack+0x28/0x38
[31323.647150]  __dump_stack+0x2c/0x3c
[31323.650718]  dump_stack+0x23c/0x2ec
[31323.654289]  panic+0x2e8/0x578
[31323.657415]  sysrq_handle_reboot+0x0/0x2c
[31323.661509]  __handle_sysrq+0xd8/0x1fc
[31323.665342]  write_sysrq_trigger+0xb0/0xc0
[31323.669528]  pde_write+0x54/0x68
[31323.672829]  proc_reg_write+0x8c/0xa8
[31323.676570]  vfs_write+0xf0/0x210
[31323.679957]  ksys_write+0x68/0xf0
[31323.683344]  __arm64_sys_write+0x1c/0x28
[31323.687356]  __invoke_syscall+0x20/0x2c
[31323.691277]  invoke_syscall+0x80/0xd0
[31323.695021]  el0_svc_common+0xbc/0x150
[31323.698853]  do_el0_svc+0x34/0x44
[31323.702249]  el0_svc+0x40/0x50
[31323.705378]  el0_sync_handler+0x134/0x200
[31323.709475]  el0_sync+0x158/0x180
[31323.712880] SMP: stopping secondary CPUs
[31323.716902] Kernel Offset: disabled
[31323.720466] CPU features: 0x240022,20006000
[31323.724731] Memory Limit: none
[31323.727877] ---[ end Kernel panic - not syncing: sysrq triggered crash ]---


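Each call trace line ends with +0xXX/0xXX. The 0xXX before the "/" is the offset into the function, and the 0xXX after it is the size of that function. The start address of the function can be looked up in vmlinux, or in the System.map generated by the build; adding the offset to it gives the address we need. Finally, addr2line tells us which file and which line that address belongs to.

Taking pde_write as an example: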

c
// In System.map:
...
cat ./System.map | grep "pde_write"
ffff8000104e9a58 t pde_write
...
ffff8000104e9a58 + 0x54 = 0xffff8000104e9aac
// Running aarch64-linux-gnu-addr2line -e vmlinux 0xffff8000104e9aac prints:
fs/proc/inode.c:330
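
The arithmetic above can be automated with a small C helper, sketched below. It assumes the call trace entry is passed without its [timestamp] prefix and that the System.map line has the usual "address type name" layout. The names trace_entry, parse_trace_entry, parse_map_line and resolve are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct trace_entry {
    char symbol[64];
    uint64_t offset; /* value before the '/' */
    uint64_t size;   /* value after the '/' */
};

/* Parse "symbol+0xOFF/0xSIZE"; returns 1 on success, 0 otherwise. */
static int parse_trace_entry(const char *text, struct trace_entry *out)
{
    assert(text != NULL && out != NULL);
    return sscanf(text, " %63[^+ ]+%" SCNx64 "/%" SCNx64,
                  out->symbol, &out->offset, &out->size) == 3;
}

/* Parse a System.map line of the form "ADDRESS TYPE NAME". */
static int parse_map_line(const char *text, uint64_t *address,
                          char *type, char name[64])
{
    assert(text != NULL);
    return sscanf(text, "%" SCNx64 " %c %63s", address, type, name) == 3;
}

/* If map_line describes the symbol named in trace_text, store the
 * absolute address (symbol start + offset) and return 1; else return 0. */
static int resolve(const char *map_line, const char *trace_text,
                   uint64_t *address)
{
    struct trace_entry e;
    uint64_t start;
    char type;
    char name[64];

    if (!parse_trace_entry(trace_text, &e) ||
        !parse_map_line(map_line, &start, &type, name) ||
        strcmp(e.symbol, name) != 0 || e.offset >= e.size)
        return 0;

    *address = start + e.offset;
    return 1;
}
```

Calling resolve("ffff8000104e9a58 t pde_write", " pde_write+0x54/0x68", &addr) stores 0xffff8000104e9aac in addr, the same value that was computed by hand above and passed to addr2line.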


Tip

Comments and discussion are welcome. If you find a mistake, please point it out. Please credit the source when reposting! 探索者


Released under the MIT License.