第六章运行时的检查与调试

调试器是一种可以检查正在执行的程序的程序。调试器可以启动并执行一个程序，然后在某个特定的行停止，方便我们检查此时此刻程序的状态。调试暂停（但没有结束）的位置被称为断点。

我们将使用GDB——GNU调试器来调试我们的内核。gdb是程序的名字，可以完成下面四个主要工作：

启动你的程序，并指明任何会影响其行为的东西。
使你的程序在指定的条件下暂停。
当程序终止时，可以检查到底发生了些什么。
改变程序的状态，你可以通过试验尝试纠正某个错误的影响，接着再了解下一个。

6.1 一个示例程序

必须要有一个现成的程序才能进行调试。对于这一章来说，我们的老朋友"Hello World"就够用了：

#include <stdio.h>

int main(int argc, char *argv[]) {
    printf("Hello World!\n");
    return 0;
}

编译时，我们用选项-g把调试信息也保存下来。

We compile it with debugging information with the option -g:

$ gcc -m32 -g hello.c -o hello

最后，我们启动gdb，并且把程序作为启动参数：

$ gdb hello

6.2 对一个程序进行静态检查

在检查一个运行中的程序之前，gdb首先会加载它。加载到内存之后（即使不运行），已经可以获取到很多有用的信息以供检查。本节中的命令可以在程序运行前使用。然而，它们也可以在程序运行时使用，并且可以显示更多的信息。

6.2.1 命令：`info target`/`info file`/`info files`

下面的命令可以打印出正在调试的目标的信息。目标就是正在调试的程序。

示例6.2.1 hello程序的命令输出，详尽地展示了一个本地的目标。

(gdb) info target

Symbols from "/tmp/hello".
Local exec file:
  `/tmp/hello', file type elf32-i386.
  Entry point: 0x8048310
  0x08048154 - 0x08048167 is .interp
  0x08048168 - 0x08048188 is .note.ABI-tag
  0x08048188 - 0x080481ac is .note.gnu.build-id
  0x080481ac - 0x080481cc is .gnu.hash
  0x080481cc - 0x0804821c is .dynsym
  0x0804821c - 0x08048266 is .dynstr
  0x08048266 - 0x08048270 is .gnu.version
  0x08048270 - 0x08048290 is .gnu.version_r
  0x08048290 - 0x08048298 is .rel.dyn
  0x08048298 - 0x080482a8 is .rel.plt
  0x080482a8 - 0x080482cb is .init
  0x080482d0 - 0x08048300 is .plt
  0x08048300 - 0x08048308 is .plt.got
  0x08048310 - 0x080484a2 is .text
  0x080484a4 - 0x080484b8 is .fini
  0x080484b8 - 0x080484cd is .rodata
  0x080484d0 - 0x080484fc is .eh_frame_hdr
  0x080484fc - 0x080485c8 is .eh_frame
  0x08049f08 - 0x08049f0c is .init_array
  0x08049f0c - 0x08049f10 is .fini_array
  0x08049f10 - 0x08049f14 is .jcr
  0x08049f14 - 0x08049ffc is .dynamic
  0x08049ffc - 0x0804a000 is .got
  0x0804a000 - 0x0804a014 is .got.plt
  0x0804a014 - 0x0804a01c is .data
  0x0804a01c - 0x0804a020 is .bss

输出显示了各种报告：

符号文件的路径。符号文件是包含着调试信息的文件。通常，它与二进制文件是同一个文件，但是更多情况是把可执行的二进制文件以及它的调试信息分成两个文件，尤其是在远程调试的时候。示例中，它是这一行：

Symbols from "/tmp/hello".

调试程序的路径以及它的文件类型。示例中，它是这一行：

Local exec file:
  `/tmp/hello', file type elf32-i386.

调试程序的入口。也就是程序运行的第一段代码。示例中，它是这一行：

Entry point: 0x8048310

带有起始和结束地址的节表。示例中，它是报告的剩余部分。

示例6.2.1 如果调试的程序运行在另外一台设备上，它就是一个远程目标，gdb只会打印一些简要的信息：

(gdb) info target

Remote serial target in gdb-specific protocol:
Debugging a target over a serial line.

6.2.2 命令：`maint info sections`

这个命令与info target类似，但给出了关于程序各节的额外信息，特别是每个节的文件偏移量以及各个标志。

下面是针对hello程序运行时的输出。

(gdb) maint info sections

Exec file:
    `/tmp/hello', file type elf64-x86-64.
  [0]     0x00400238->0x00400254 at 0x00000238: .interp ALLOC LOAD READONLY DATA HAS_CONTENTS
  [1]     0x00400254->0x00400274 at 0x00000254: .note.ABI-tag ALLOC LOAD READONLY DATA HAS_CONTENTS
  [2]     0x00400274->0x00400298 at 0x00000274: .note.gnu.build-id ALLOC LOAD READONLY DATA HAS_CONTENTS
  [3]     0x00400298->0x004002b4 at 0x00000298: .gnu.hash ALLOC LOAD READONLY DATA HAS_CONTENTS
  [4]     0x004002b8->0x00400318 at 0x000002b8: .dynsym ALLOC LOAD READONLY DATA HAS_CONTENTS
  [5]     0x00400318->0x00400355 at 0x00000318: .dynstr ALLOC LOAD READONLY DATA HAS_CONTENTS
  [6]     0x00400356->0x0040035e at 0x00000356: .gnu.version ALLOC LOAD READONLY DATA HAS_CONTENTS
  [7]     0x00400360->0x00400380 at 0x00000360: .gnu.version_r ALLOC LOAD READONLY DATA HAS_CONTENTS
....remaining output omitted....

这个输出与info target的输出类似，但是有更多的细节。在节的名字旁边是节的标志，它们是节的属性。在这里，我们可以看到带有LOAD标志的部分来自LOAD段。这个命令可以结合节的标志，对输出进行过滤：

ALLOBJ 显示所有加载的对象文件的节，包括共享库。共享库只有在程序已经运行时才会显示出来。

section names 只显示命名了的节。

示例6.2.4 命令：

(gdb) maint info sections .text .data .bss

只显示.text、.data和.bss节：

Exec file:
    `/tmp/hello', file type elf64-x86-64.
  [13]     0x00400430->0x004005c2 at 0x00000430: .text ALLOC LOAD READONLY CODE HAS_CONTENTS
  [24]     0x00601028->0x00601038 at 0x00001028: .data ALLOC LOAD DATA HAS_CONTENTS
  [25]     0x00601038->0x00601040 at 0x00001038: .bss ALLOC

section-flags 只显示具有指定标志的节。注意，尽管这些标志是基于节属性定义的，他们都是gdb特有的。目前，gdb理解下列标志：

ALLOC 节会在程序加载时在进程中分配获得空间。除了包含调试信息的节之外，剩下所有的节都有这个标志。

LOAD 节会从文件中加载到子进程内存中。预初始化的代码和数据都有设置这个标志，而.bss节没有。

RELOC 节在加载之前需要重新定位。

READONLY 节不能被子进程修改。

CODE 节只包含可执行代码。

DATA 节只包含数据（没有可执行代码）。

ROM 节会驻留在ROM中。

CONSTRUCTOR 节包含用于构造函数/析构函数列表的数据。

HAS_CONTENTS 节不为空。

NEVER_LOAD 是给链接器的指令，不会输出该节。

COFF_SHARED_LIBRARY 会通知链接器此节包含COFF共享库信息。COFF是一种对象文件格式，与ELF类似。ELF是可执行二进制文件的文件格式，而COFF是对象文件的文件格式。

IS_COMMON 节包含通用符号。

示例6.2.5 用下面的命令，我们可以限制输出，只显示包含代码的节：

(gdb) maint info sections CODE

输出是：

Exec file:
    `/tmp/hello', file type elf64-x86-64.
  [10]     0x004003c8->0x004003e2 at 0x000003c8: .init ALLOC LOAD READONLY CODE HAS_CONTENTS
  [11]     0x004003f0->0x00400420 at 0x000003f0: .plt ALLOC LOAD READONLY CODE HAS_CONTENTS
  [12]     0x00400420->0x00400428 at 0x00000420: .plt.got ALLOC LOAD READONLY CODE HAS_CONTENTS
  [13]     0x00400430->0x004005c2 at 0x00000430: .text ALLOC LOAD READONLY CODE HAS_CONTENTS
  [14]     0x004005c4->0x004005cd at 0x000005c4: .fini ALLOC LOAD READONLY CODE HAS_CONTENTS

6.2.3 命令：`info functions`

这个命令列出了所有的函数名称以及它们的加载地址。我们可以用正则表达式来过滤这些名称。

示例6.2.6 执行命令，我们可以得到如下的输出：

(gdb) info functions

All defined functions:
File hello.c:
int main(int, char **);
Non-debugging symbols:
0x00000000004003c8  _init
0x0000000000400400  puts@plt
0x0000000000400410  __libc_start_main@plt
0x0000000000400430  _start
0x0000000000400460  deregister_tm_clones
0x00000000004004a0  register_tm_clones
0x00000000004004e0  __do_global_dtors_aux
0x0000000000400500  frame_dummy
0x0000000000400550  __libc_csu_init
0x00000000004005c0  __libc_csu_fini
0x00000000004005c4  _fini

6.2.3 命令：`info variables`

这个命令列出了所有的全局变量和静态变量的名称，它们也可以用正则表达式进行过滤。

示例6.2.7 如果我们在示例源程序中添加一个全局变量int i并重新编译，然后运行该命令，我们会得到以下输出：

(gdb) info variables

All defined variables:
File hello.c:
int i;
Non-debugging symbols:
0x00000000004005d0  _IO_stdin_used
0x00000000004005e4  __GNU_EH_FRAME_HDR
0x0000000000400708  __FRAME_END__
0x0000000000600e10  __frame_dummy_init_array_entry
0x0000000000600e10  __init_array_start
0x0000000000600e18  __do_global_dtors_aux_fini_array_entry
0x0000000000600e18  __init_array_end
0x0000000000600e20  __JCR_END__
0x0000000000600e20  __JCR_LIST__
0x0000000000600e28  _DYNAMIC
0x0000000000601000  _GLOBAL_OFFSET_TABLE_
0x0000000000601028  __data_start
0x0000000000601028  data_start
0x0000000000601030  __dso_handle
0x000000000060103c  __bss_start
0x000000000060103c  _edata
0x000000000060103c  completed
0x0000000000601040  __TMC_END__
0x0000000000601040  _end

6.2.4 命令：`disassemble/disas`

这个命令显示可执行文件的汇编代码。

示例6.2.8 gdb可以显示一个函数的汇编代码：

(gdb) disassemble main

Dump of assembler code for function main:
    0x0804840b <+0>:  lea    ecx,[esp+0x4]
    0x0804840f <+4>:  and    esp,0xfffffff0
    0x08048412 <+7>:  push   DWORD PTR [ecx-0x4]
    0x08048415 <+10>: push   ebp
    0x08048416 <+11>: mov    ebp,esp
    0x08048418 <+13>: push   ecx
    0x08048419 <+14>: sub    esp,0x4
    0x0804841c <+17>: sub    esp,0xc
    0x0804841f <+20>: push   0x80484c0
    0x08048424 <+25>: call   0x80482e0 <puts@plt>
    0x08048429 <+30>: add    esp,0x10
    0x0804842c <+33>: mov    eax,0x0
    0x08048431 <+38>: mov    ecx,DWORD PTR [ebp-0x4]
    0x08048434 <+41>: leave  
    0x08048435 <+42>: lea    esp,[ecx-0x4]
    0x08048438 <+45>: ret    
End of assembler dump.

如果包含了源代码，这个命令会更有用：

(gdb) disassemble /s main

Dump of assembler code for function main:
hello.c:
4 {
    0x0804840b <+0>:  lea     ecx,[esp+0x4]
    0x0804840f <+4>:  and     esp,0xfffffff0
    0x08048412 <+7>:  push    DWORD PTR [ecx-0x4]
    0x08048415 <+10>: push   ebp
    0x08048416 <+11>: mov    ebp,esp
    0x08048418 <+13>: push   ecx
    0x08048419 <+14>: sub    esp,0x4
5     printf("Hello World!\n");
    0x0804841c <+17>: sub    esp,0xc
    0x0804841f <+20>: push   0x80484c0
    0x08048424 <+25>: call   0x80482e0 <puts@plt>
    0x08048429 <+30>: add    esp,0x10
6     return 0;
    0x0804842c <+33>: mov    eax,0x0
7 }
    0x08048431 <+38>: mov    ecx,DWORD PTR [ebp-0x4]
    0x08048434 <+41>: leave  
    0x08048435 <+42>: lea    esp,[ecx-0x4]
    0x08048438 <+45>: ret    
End of assembler dump.

这时，高阶的源代码（绿色文本）作为汇编的一部分被包含了进来。每一行代码下方都有对应的汇编代码。

示例6.2.10 如果加上/r选项，十六进制的原始指令也会包含进来，就像objdump默认会显示汇编代码一样：

(gdb) disassemble /rs main

Dump of assembler code for function main:
hello.c:
4 {
    0x0804840b <+0>:  8d 4c 24 04     lea    ecx,[esp+0x4]
    0x0804840f <+4>:  83 e4 f0        and    esp,0xfffffff0
    0x08048412 <+7>:  ff 71 fc        push   DWORD PTR [ecx-0x4]
    0x08048415 <+10>: 55              push   ebp
    0x08048416 <+11>: 89 e5           mov    ebp,esp
    0x08048418 <+13>: 51              push   ecx
    0x08048419 <+14>: 83 ec 04        sub    esp,0x4
5     printf("Hello World!\n");
    0x0804841c <+17>: 83 ec 0c        sub    esp,0xc
    0x0804841f <+20>: 68 c0 84 04 08  push   0x80484c0
    0x08048424 <+25>: e8 b7 fe ff ff  call   0x80482e0 <puts@plt>
    0x08048429 <+30>: 83 c4 10        add    esp,0x10
6     return 0;
    0x0804842c <+33>: b8 00 00 00 00  mov    eax,0x0
7 }
    0x08048431 <+38>: 8b 4d fc        mov    ecx,DWORD PTR [ebp-0x4]
    0x08048434 <+41>: c9              leave  
    0x08048435 <+42>: 8d 61 fc        lea    esp,[ecx-0x4]
    0x08048438 <+45>: c3              ret    
End of assembler dump.

示例6.2.11 也可以指明输出某一个文件内的某个函数：

(gdb) disassemble /sr 'hello.c'::main

Dump of assembler code for function main:
hello.c:
4 {
    0x0804840b <+0>:  8d 4c 24 04     lea    ecx,[esp+0x4]
    0x0804840f <+4>:  83 e4 f0        and    esp,0xfffffff0
    0x08048412 <+7>:  ff 71 fc        push   DWORD PTR [ecx-0x4]
    0x08048415 <+10>: 55              push   ebp
    0x08048416 <+11>: 89 e5           mov    ebp,esp
    0x08048418 <+13>: 51              push   ecx
    0x08048419 <+14>: 83 ec 04        sub    esp,0x4
5     printf("Hello World!\n");
    0x0804841c <+17>: 83 ec 0c        sub    esp,0xc
    0x0804841f <+20>: 68 c0 84 04 08  push   0x80484c0
    0x08048424 <+25>: e8 b7 fe ff ff  call   0x80482e0 <puts@plt>
    0x08048429 <+30>: 83 c4 10        add    esp,0x10
6     return 0;
    0x0804842c <+33>: b8 00 00 00 00  mov    eax,0x0
7 }
    0x08048431 <+38>: 8b 4d fc        mov    ecx,DWORD PTR [ebp-0x4]
    0x08048434 <+41>: c9              leave  
    0x08048435 <+42>: 8d 61 fc        lea    esp,[ecx-0x4]
    0x08048438 <+45>: c3              ret    
End of assembler dump.

文件名必须包含在单引号中，且函数必须以双冒号为前缀，比如'hello.c'::main指定对文件hello.c中的函数main进行反汇编。

6.2.6 命令：`x`

这个命令可以检查指定内存范围的内容。

示例6.2.12 我们可以检查main的原始内容：

(gdb) x main

0x804840b <main>: 0x04244c8d

默认情况下，不带任何参数的这个命令只打印单个内存地址的内容。示例中，就只打印了main的起始内存地址。

示例6.2.13 有了格式参数，这个命令可以以特定的格式打印某个范围内的内存：

(gdb) x/20b main

0x804840b <main>:     0x8d  0x4c  0x24  0x04  0x83  0xe4  0xf0  0xff
0x8048413 <main+8>:   0x71  0xfc  0x55  0x89  0xe5  0x51  0x83  0xec
0x804841b <main+16>:  0x04  0x83  0xec  0x0c

/20b main参数意思是命令从main的位置开始打印20个字节

格式参数的一般形式是：/<repeated count><format letter>

如果没有提供重复计数，gdb默认提供的计数为1。格式字母可以是下列之一：

字母	描述
o	以八进制格式打印内存里的内容。
x	以十六进制格式打印内存里的内容。
d	以十进制格式打印内存里的内容。
u	以无符号十进制格式打印内存里的内容。
t	以二进制格式打印内存里的内容。
f	以浮点格式打印内存里的内容。
a	以内存地址格式打印内存里的内容。
i	以一系列汇编指令的格式打印内存里的内容。
c	以ASCII码数组格式打印内存里的内容。
s	以字符串格式打印内存里的内容。

在不同的场景下，某些格式要比其他格式更有优势。如果一个内存区域包含的都是浮点数字，那么使用f格式要比把数字看成分离的单字节十六进制数字要好得多。

6.2.7 命令：`print/p`

检查原始内存是很有用的，然而通常情况下，最好可以有更容易被人阅读的输出。这个命令就是用来完成这个任务的：它可以完美输出一个表达式。表达式可以是一个全局变量，也可以是当前堆栈框架中的一个局部变量，还可以是一个函数、一个寄存器、一个数字等等。

6.3 程序的运行时检查

调试器的主要用途是在程序运行时检查它的状态。gdb提供了一套有用的命令来检索有用的运行时信息。

6.3.1 命令：`run`

这个命令启动目标程序。

示例6.3.1 运行hello程序：

(gdb) r

Starting program: /tmp/hello 
Hello World!
[Inferior 1 (process 1002) exited normally]

程序成功运行，并打印出了“Hello World”的信息。然而，如果gdb能做的只是运行一个程序，那就没有什么用了。

6.3.2 命令：`break/b`

这个命令在高阶源代码的某个位置设置一个断点。当gdb运行到断点标记的特定位置时，它会停止执行，以便程序员检查程序的当前状态。

示例6.3.2 断点可以设置在编辑器所显示的行上。假设我们想在程序的第3行设置一个断点，也就是main函数的开始位置：

1 #include <stdio.h>
2 
3 int main(int argc, char *argv[])
4 {
5     printf("Hello World!\n");
6     return 0;
7 }

运行程序，gdb不是从头到尾运行，而是在第3行停下来：

  (gdb) b 3

Breakpoint 1 at 0x400535: file hello.c, line 3.

(gdb) r

Starting program: /tmp/hello 
Breakpoint 1, main (argc=1, argv=0x7fffffffdfb8) at hello.c:5
5     printf("Hello World!\n");

断点在第3行，但gdb停在第5行。原因是第3行不包含代码，而是一个函数签名；gdb只在它能执行代码的地方停止。函数中的代码从第5行开始，即对printf的调用，所以gdb在那里停止。

示例6.3.3 代码的行号并不总是指定断点的可靠方法，因为源代码会改变。如果想要gdb总是停在main函数上该怎么办？在这种情况下，更好的方法是直接使用函数名：

b main

于是无论源代码如何变化，gdb总是停在主函数上。

示例6.3.4 有时，调试程序没有包含调试信息，或者gdb正在调试汇编代码。在这种情况下，可以指定一个内存地址作为停止点。为了得到这个函数地址，可以使用print命令:

(gdb) print main

$3 = {int (int, char **)} 0x400526 <main>

知道了main的地址，我们可以很容易地用内存地址设置一个断点：

b *0x400526

示例6.3.5 gdb还可以在任何源文件中设置断点。假设hello程序不是由单个文件而是由许多文件组成，例如hello1.c、hello2.c、hello3.c...在这种情况下，只需在行号前加上文件名：

b hello.c:3

示例6.3.6 也可以指定特定文件中的函数名：

b hello.c:main

6.3.3 命令：`next/n`

这个命令执行当前行，然后在下一行停止。如果当前行是一个函数调用，则单步执行它。

示例6.3.7 在main处设置断点后，运行一个程序并在第一个printf处停止:

(gdb) r

Starting program: /tmp/hello 
Breakpoint 1, main (argc=1, argv=0x7fffffffdfb8) at hello.c:5
5     printf("Hello World!\n");

接着，为了前进到下一个语句，我们使用next命令：

(gdb) n

Hello World!
6     return 0;

在输出中，第一行显示了执行第5行后产生的输出；然后，下一行显示了gdb当前停止的地方，也就是第6行。

6.3.4 命令：`step/s`

这个命令执行当前行，然后在下一行停止。如果当前行是一个函数调用，则进入该函数，在被调用函数的首行停止。

示例6.3.8 假设我们有一个新的函数add^{为什么这里选择添加一个新的函数以及对应函数调用，而不是使用
现成的printf？调试共享库函数是很棘手的：为了调试，必须安装、加载调试信息。为了演示这个简单的命令，不值得这么麻烦。}：

#include <stdio.h>

int add(int a, int b) {
    return a + b;
}

int main(int argc, char *argv[]) {
    add(1, 2);
    printf("Hello World!\n");
    return 0;
}

如果在函数调用printf时使用step命令而不是next命令，gdb会步进到这个函数的内部：

(gdb) r

Starting program: /tmp/hello 
Breakpoint 1, main (argc=1, argv=0xffffd154) at hello.c:11
11    add(1, 2);

(gdb) s

add (a=1, b=2) at hello.c:6
6     return a + b;

在执行了命令s之后，gdb步入了add函数，它的第一条语句是返回语句。

6.3.5 命令：`ni`

本质上，gdb是在操作汇编指令。面向源代码的逐行调试只是一项改进，使得调试对程序员更友好。就如objdump和disassemble命令所演示的，Ｃ语言中的每条语句都转化为了一条或是多条汇编指令。借助可用的调试信息，gdb知道某几条指令属于同一行高阶代码；逐行调试只是在从当前行移到下一行时执行这一行的汇编指令。

这个命令执行属于当前行的一条汇编指令。在当前行的所有汇编指令执行结束以前，gdb不会移动到下一行。如果当前指令是函数调用，则前进到下一条指令。

示例6.3.9 当断点在printf调用上时使用ni命令，它将步入每条汇编指令：

(gdb) disassemble /s main

Dump of assembler code for function main:
hello.c:
4 {
    0x0804840b <+0>:    lea    ecx,[esp+0x4]
    0x0804840f <+4>:    and    esp,0xfffffff0
    0x08048412 <+7>:    push   DWORD PTR [ecx-0x4]
    0x08048415 <+10>:   push   ebp
    0x08048416 <+11>:   mov    ebp,esp
    0x08048418 <+13>:   push   ecx
    0x08048419 <+14>:   sub    esp,0x4
5     printf("Hello World!\n");
    0x0804841c <+17>:   sub    esp,0xc
    0x0804841f <+20>:   push   0x80484c0
    0x08048424 <+25>:   call   0x80482e0 <puts@plt>
    0x08048429 <+30>:   add    esp,0x10
6     return 0;
=>  0x0804842c <+33>:   mov    eax,0x0
7 }
    0x08048431 <+38>:   mov    ecx,DWORD PTR [ebp-0x4]
    0x08048434 <+41>:   leave  
    0x08048435 <+42>:   lea    esp,[ecx-0x4]
    0x08048438 <+45>:   ret    
End of assembler dump.

(gdb) r

Starting program: /tmp/hello 
Breakpoint 1, main (argc=1, argv=0xffffd154) at hello.c:5
5	    printf("Hello World!\n");

(gdb) ni

0x0804841f  5     printf("Hello World!\n");

(gdb) ni

0x08048424  5     printf("Hello World!\n");

(gdb) ni

Hello World!
0x08048429  5     printf("Hello World!\n");

(gdb)

6     return 0;

进入ni后，gdb执行当前指令并显示下一条指令。这就是为什么在上边的输出中，gdb只显示了3个地址：0x0804841f、0x08048424和0x08048429。位于0x0804841c的指令，也就是printf的第一条指令之所以没有显示，是因为它是gdb停止的第一条指令。假设gdb在printf的第一条指令0x0804841c处停止，可以用x命令显示当前指令：

(gdb) x/i $eip

=> 0x804841c <main+17>: sub    esp,0xc

6.3.6 命令：`si`

与ni相似，这个命令执行属于当前行的当前汇编指令。但如果当前指令是一个函数调用，则进入函数，准备执行被调用函数的第一条指令。

示例6.3.10 回忆一下，从printf生成的汇编代码包含一条调用指令。

(gdb) disassemble /s main

Dump of assembler code for function main:
hello.c:
4 {
    0x0804840b <+0>:  lea    ecx,[esp+0x4]
    0x0804840f <+4>:  and    esp,0xfffffff0
    0x08048412 <+7>:  push   DWORD PTR [ecx-0x4]
    0x08048415 <+10>: push   ebp
    0x08048416 <+11>: mov    ebp,esp
    0x08048418 <+13>: push   ecx
    0x08048419 <+14>: sub    esp,0x4
5     printf("Hello World!\n");
    0x0804841c <+17>: sub    esp,0xc
    0x0804841f <+20>: push   0x80484c0
    0x08048424 <+25>: call   0x80482e0 <puts@plt>
    0x08048429 <+30>: add    esp,0x10
6     return 0;
=> 0x0804842c <+33>:  mov    eax,0x0
7 }
    0x08048431 <+38>: mov    ecx,DWORD PTR [ebp-0x4]
    0x08048434 <+41>: leave  
    0x08048435 <+42>: lea    esp,[ecx-0x4]
    0x08048438 <+45>: ret    
End of assembler dump.

我们再次尝试逐条指令的步进，但这次是在0x08048424处运行si，这是函数调用所在的位置：

(gdb) si

0x0804841f  5         printf("Hello World!\n");

(gdb) si

0x08048424  5         printf("Hello World!\n");

(gdb) x/i $eip

=> 0x8048424 <main+25>:	call   0x80482e0 <puts@plt>

(gdb) si

0x080482e0  in puts@plt ()

紧接着0x8048424的下一条指令是puts函数的第一条指令，位于0x080482e0。换句话说，gdb步进到了puts函数，而不是跳过了它。

6.3.7 命令：`until`

这个命令持续执行，直到到达当前堆栈帧中超过当前行的行。

示例6.3.11 假设我们有一个函数，执行一个长循环：

#include <stdio.h>

int add1000() {
    int total = 0;

    for (int i = 0; i < 1000; ++i){
        total += i;
    }

    printf("Done adding!\n");

    return total;
}

int main(int argc, char *argv[])
{
    add1000(1, 2);
    printf("Hello World!\n");
    return 0;
}

使用next命令，需要执行1000次完成这个循环。然而一个更快的方法是使用until：

(gdb) b add1000

Breakpoint 1 at 0x8048411: file hello.c, line 4.

(gdb) r

Starting program: /tmp/hello 
Breakpoint 1, add1000 () at hello.c:4
4     int total = 0;

(gdb) until

5     for (int i = 0; i < 1000; ++i){

(gdb) until

6         total += i;

(gdb) until

5     for (int i = 0; i < 1000; ++i){

(gdb) until

8     printf("Done adding!\n");

执行第一个until，gdb停在第5行，因为第5行比第4行大。

执行第二个until，gdb在第6行停止，因为第6行比第5行大。

执行第三个until，gdb停在第5行，因为循环仍在继续。因为第5行小于第6行，在执行第四个until时，gdb一直执行到不再回到第5行，并在第8行停止。这是一个跳过循环的好方法，而不是去设置不必要的断点。

示例6.3.12 可以提供until一个参数来明确地执行到某个特定的行。

(gdb) r

Starting program: /tmp/hello 
Breakpoint 1, add1000 () at hello.c:4
4     int total = 0;

(gdb) until 8

add1000 () at hello.c:8
8     printf("Done adding!\n");

189

Command: finish

This command executes until the end of a function and displays the return value. finish is actually just a more convenient version of until.

Using the add1000 function from the previous example and use finish instead of until:

(gdb) r

Starting program: /tmp/hello

Breakpoint 1, add1000 () at hello.c:4

4 int total = 0;

(gdb) finish

Run till exit from #0 add1000 () at hello.c:4

Done adding!

0x08048466 in main (argc=1, argv=0xffffd154) at hello.c:15

15 add1000(1, 2);

Value returned is $1 = 499500

Command: bt

This command prints the backtrace of all stack frames. A [margin: backtrace ]backtracebacktrace is a list of currently active functions:

Suppose we have a chain of function calls:

void d(int d) { };

void c(int c) { d(0); }

void b(int b) { c(1); }

void a(int a) { b(2); }

int main(int argc, char *argv[])

{

a(3);

return 0;

}

bt can visualize such a chain in action:

(gdb) b a

Breakpoint 1 at 0x8048404: file hello.c, line 9.

(gdb) r

Starting program: /tmp/hello

Breakpoint 1, a (a=3) at hello.c:9

9 void a(int a) { b(2); }

(gdb) s

b (b=2) at hello.c:7

7 void b(int b) { c(1); }

(gdb) s

c (c=1) at hello.c:5

5 void c(int c) { d(0); }

(gdb) s

d (d=0) at hello.c:3

3 void d(int d) { };

(gdb) bt

#0 d (d=0) at hello.c:3

#1 0x080483eb in c (c=1) at hello.c:5

#2 0x080483fb in b (b=2) at hello.c:7

#3 0x0804840b in a (a=3) at hello.c:9

#4 0x0804841b in main (argc=1, argv=0xffffd154) at hello.c:13

Most-recent calls are placed on top and least-recent calls are near the bottom. In this case, d is the most current active function, so it has the index 0. Next is c, the 2[superscript:nd] active function, has the index 1 and so on with function b, function a, and finally function main at the bottom, the least-recent function. That is how we read a backtrace.

Command: up

This command goes up one frame earlier the current frame.

Instead of staying in d function, we can go up to c function and look at its state:

(gdb) bt

#0 d (d=0) at hello.c:3

#1 0x080483eb in c (c=1) at hello.c:5

#2 0x080483fb in b (b=2) at hello.c:7

#3 0x0804840b in a (a=3) at hello.c:9

#4 0x0804841b in main (argc=1, argv=0xffffd154) at hello.c:13

(gdb) up

#1 0x080483eb in c (c=1) at hello.c:3

3 void b(int b) { c(1); }

The output displays the current frame is moved to c and where the call to c is made, which is in function b at line 3.

Command: down

Similar to up, this command goes down one frame later then the current frame.

After inspecting c function, we can go back to d:

(gdb) bt

#0 d (d=0) at hello.c:3

#1 0x080483eb in c (c=1) at hello.c:5

#2 0x080483fb in b (b=2) at hello.c:7

#3 0x0804840b in a (a=3) at hello.c:9

#4 0x0804841b in main (argc=1, argv=0xffffd154) at hello.c:13

(gdb) up

#1 0x080483eb in c (c=1) at hello.c:3

3 void b(int b) { c(1); }

(gdb) down

#0 d (d=0) at hello.c:1

1 void d(int d) { };

Command: info registers

This command lists the current values in commonly used registers. This command is useful when debugging assembly and operating system code, as we can inspect the current state of the machine.

Executing the command, we can see the commonly used registers:

(gdb) info registers

eax 0xf7faddbc -134554180

ecx 0xffffd0c0 -12096

edx 0xffffd0e4 -12060

ebx 0x0 0

esp 0xffffd0a0 0xffffd0a0

ebp 0xffffd0a8 0xffffd0a8

esi 0xf7fac000 -134561792

edi 0xf7fac000 -134561792

eip 0x804841c 0x804841c <main+17>

eflags 0x286 [ PF SF IF ]

cs 0x23 35

ss 0x2b 43

ds 0x2b 43

es 0x2b 43

fs 0x0 0

gs 0x63 99

The above registers suffice for writing our operating system in later part.

How debuggers work: A brief introduction

How breakpoints work

When a programmer places a breakpoint somewhere in his code, what actually happens is that the first opcode of the first instruction of a statement is replaced with another instruction, int 3 with opcode CCh:

[float Figure: [Figure 0.17: Opcode replacement, with int 3 ]

+-----+-----+---------+ +-----+-----+----+ | 83 | ec | 0c | \rightarrow | cc | ec | 0c | +-----+-----+---------+ +-----+-----+----+ +---------------------+ +----------------+ | sub esp,0x4 | | int 3 | +---------------------+ +----------------+

]

int 3 only costs a single byte, making it efficient for debugging. When int 3 instruction is executed, the operating system calls its breakpoint interrupt handler. The handler then checks what process reaches a breakpoint, pauses it and notifies the debugger it has paused a debugged process. The debugged process is only paused and that means a debugger is free to inspect its internal state, like a surgeon operates on an anesthetic patient. Then, the debugger replaces the int 3 opcode with the original opcode and executes the original instruction normally.

[float Figure: [Figure 0.18: Restore the original opcode, after int 3 was executed ]

+-----+-----+----+ +-----+-----+---------+ | cc | ec | 0c | \rightarrow | 83 | ec | 0c | +-----+-----+----+ +-----+-----+---------+ +----------------+ +---------------------+ | int 3 | | sub esp,0x4 | +----------------+ +---------------------+

]

It is simple to see int 3 in action. First, we add an int 3 instruction where we need gdb to stop:

#include <stdio.h>

int main(int argc, char *argv[])

{

asm("int 3");

printf("Hello World\n");

return 0;

}

int 3 precedes printf, so gdb is expected to stop at printf. Next, we compile with debug enable and with Intel syntax:

$ gcc -masm=intel -m32 -g hello.c -o hello

Finally, start gdb:

$ gdb hello

Running without setting any breakpoint, gdb stops at printf call, as expected:

(gdb) r

Starting program: /tmp/hello

Program received signal SIGTRAP, Trace/breakpoint trap.

main (argc=1, argv=0xffffd154) at hello.c:6

6 printf("Hello World\n");

The blue text indicates that gdb encountered a breakpoint, and indeed it stopped at the right place: the printf call, where int 3 preceded it.

Single stepping

When breakpoint is implemented, it is easy to implement single stepping: a debugger simply places another int 3 opcode in the next instruction. So, when a programmer sets a breakpoint at an instruction, the next instruction is automatically set by the debugger, thus enable instruction by instruction debugging. Similarly, source line by line debugging is just the placements of the very first opcodes in the two statements with two int 3 opcodes.

How a debugger understands high level source code

DWARF is a debugging file format used by many compilers and debuggers to support source level debugging. DWARF contains information that maps between entities in the executable binary with the source files. A program entity can either be data or code. A DIE, or [margin: Debugging Information Entry ]Debugging Information EntryDebugging Information Entry, is a description of a program entity. A DIE consists of a tag, which specifies the entity that the DIE describes, and a list of attributes that describes the entity. Of all the attributes, these two attributes enables source-level debugging:

• Where the entity appears in the source files: which file and which line the entity appears.

• Where the entity appears in the executable binary: in which memory address the entity is loaded at runtime. With the precise address, gdb can retrieve correct value for a data entity, or place a correct breakpoint and stop accordingly for a code entity. Without the information of these addresses, gdb would not know where the entities are to inspect them.

+---------------------------------------------------------------------------------------------------------------------------------------------------+ +------------------------------------------------------------------+ | hello.c | | DIE | +---------------------------------------------------------------------------------------------------------------------------------------------------+ +------------------------------------------------------------------+ +--------------------------------------------------------------+------------------------------------------------------------------------------------+ +------------------------------------------------------------------+ | Line 1

Line 2

\Rightarrow Line 3

Line 5

Line 6 | #include <stdio.h>

int main(int argc, char *argv[])

..........

|

\rightarrow | ....

....

main in hello.c is at 0x804840b in hello

....

.... | +--------------------------------------------------------------+------------------------------------------------------------------------------------+ +------------------------------------------------------------------+

                                                                                                                                                                                                 \downarrow

\uparrow

                                                                                                                                                                           +------------------------------------------------------------------+
                                                                                                                                                                           | hello (at 0x804840b)                                             |
                                                                                                                                                                           +------------------------------------------------------------------+
                                                                                                                                                                           +------------------------------------------------------------------+
                                                                                                                                                                           | ...8d 4c 24 04 83 e4 f0 ff 71 fc ....                            |
                                                                                                                                                                           +------------------------------------------------------------------+

In addition to DIEs, another binary-to-source mapping is the line number table. The line number table maps between a line in the source code and at which memory address is the start of the line in the executable binary.

In sum, to successfully enable source-level debugging, a debugger needs to know the precise location of the source files and the load addresses at runtime. Address matching, between the image layout of the ELF binary and the address where it is loaded, is extremely important since debug information relies on correct loading address at runtime. That is, it assumes the addresses as recorded in the binary image at compile-time the same as at runtime e.g. if the load address for .text section is recorded in the executable binary at 0x800000, then when the binary actually runs, .text should really be loaded at 0x800000 for gdb to be able to correctly match running instructions with high-level code statement. Address mismatching makes debug information useless, as actual code at one address is displayed as code at another address. Without this knowledge, we will not be able to build an operating system that can be debugged with gdb.

When an executable binary contains debug info, readelf can display such information in a readable format. Using the good old hello world program:

#include <stdio.h>

int main(int argc, char *argv[])

{

printf("Hello World\n");



return 0;

}

and compile with debug info:

$ gcc -m32 -g hello.c -o hello

With the binary ready, we can look at the line number table with the command:

$ readlelf -wL hello

-w option prints all the debug information. In combination with its sub-option, only specific information is displayed. For example, with -L, only the line number table is displayed:

Decoded dump of debug contents of section .debug_line:

CU: hello.c:

File name Line number Starting address

hello.c 6
0x804840b

hello.c 7
0x804841c

hello.c 9
0x804842c

hello.c 10
0x8048431

From the above output:

CU shorts for Compilation Unit, a separately compiled source file. In the example, we only have one file, hello.c.

File name displays the filename of the current compilation unit.

Line number is the line number in the source file of which the line is not an empty line. In the example, line 8 is an empty line, so it does not appear.

Starting address is the memory address where the line actually starts in the executable binary.

With such crystal clear information, this is how gdb is able to set a breakpoint on a line easily. For placing breakpoints on variables and functions, it is time to look at the DIEs. To get the DIEs information from an executable binary, run the command:

$ readlelf -wi hello

-wi option lists all the DIE entries. This is one typical DIE entry:

<0>: Abbrev Number: 1 (DW_TAG_compile_unit)

<c> DW_AT_producer : (indirect string, offset: 0xe):

GNU C11 5.4.0 20160609 -masm=intel -m32 -mtune=generic -march=i686 -g -fstack-protector-strong

<10> DW_AT_language : 12 (ANSI C99) <11> DW_AT_name : (indirect string, offset: 0xbe):

hello.c

<15> DW_AT_comp_dir : (indirect string, offset: 0x97):

/tmp

<19> DW_AT_low_pc : 0x804840b <1d> DW_AT_high_pc : 0x2e <21> DW_AT_stmt_list : 0x0

Red This left-most number indicates the current nesting level of a DIE entry. 0 is the outer-most level DIE with its entity is the compilation unit. This means subsequent DIE entries with higher nesting level are all the children of this tag, the compilation unit. It makes sense, as all the entities must originate from a source file.

Blue These numbers in hex format indicate the offsets into .debug_info section. Each meaningful information is displayed along with its offset. When an attribute references to another attribute, the offset is used to precisely identify the referenced attribute.

Green These names with DW_AT_ prefix are the attributes attached to a DIE that describe an entity. Notable attributes:

DW_AT_name DW_AT_comp_dir The filename of the compilation unit and the directory where compilation occurred. Without the filename and the path, gdb would not be able to display the high-level source, despite the availability of the debug info. Debug info only contains the mapping between source and binary, not the source code itself. DW_AT_low_pc DW_AT_high_pc The start and end of the current entity, which is the compilation unit, in the executable binary. The value in DW_AT_low_pc is the starting address. DW_AT_high_pc is the size of the compilation unit, when adding up to DW_AT_low_pc results in the end address of the entity. In this example, code compiled from hello.c starts at 0x804840b and end at \mathtt{0x804840b+0x2e=0x8048439}

. To really make sure, we verify with objdump:

int main(int argc, char *argv[]) { 804840b: 8d 4c 24 04 lea ecx,[esp+0x4] 804840f: 83 e4 f0 and esp,0xfffffff0 8048412: ff 71 fc push DWORD PTR [ecx-0x4] 8048415: 55 push ebp 8048416: 89 e5 mov ebp,esp 8048418: 51 push ecx 8048419: 83 ec 04 sub esp,0x4 printf("Hello World\n"); 804841c: 83 ec 0c sub esp,0xc 804841f: 68 c0 84 04 08 push 0x80484c0 8048424: e8 b7 fe ff ff call 80482e0 <puts@plt> 8048429: 83 c4 10 add esp,0x10 return 0; 804842c: b8 00 00 00 00 mov eax,0x0 } 8048431: 8b 4d fc mov ecx,DWORD PTR [ebp-0x4] 8048434: c9 leave 8048435: 8d 61 fc lea esp,[ecx-0x4] 8048438: c3 ret 8048439: 66 90 xchg ax,ax 804843b: 66 90 xchg ax,ax 804843d: 66 90 xchg ax,ax 804843f: 90 nop It is true: main starts at 804840b and end at 8048439, right after the ret instruction at 8048438. The instructions after 8048439 are just padding bytes inserted by gcc for alignment, which do not belong to main. Note that the output from objdump shows much more code past main. It is not counted, as the code is outside of hello.c, added by gcc for the operating system. hello.c contains only one function: main and this is why hello.c also starts and ends the same as main.

Pink This number displays the abbreviation form of a tag. An abbreviation is the form of a DIE. When debug info is displayed with -wi, the DIEs are displayed with their values. -wa option shows abbreviations in the .debug_abbrev section:

Contents of the .debug_abbrev section: Number TAG (0x0) 1 DW_TAG_compile_unit [has children] DW_AT_producer DW_FORM_strp DW_AT_language DW_FORM_data1 DW_AT_name DW_FORM_strp DW_AT_comp_dir DW_FORM_strp DW_AT_low_pc DW_FORM_addr DW_AT_high_pc DW_FORM_data4 DW_AT_stmt_list DW_FORM_sec_offset DW_AT value: 0 DW_FORM value: 0 .... more abbreviations .... The output is similar to a DIE output, with only attribute names and without any value. We can also say an abbreviation is a type of a DIE, as an abbreviation represents the structure of a particular DIE. Many DIEs share the same abbreviation, or structure, thus they are of the same type. An abbreviation number specifies which type a DIE is in the abbreviation table above. Abbreviations improve encoding efficiency (reduce binary size) because each DIE needs not to carry their structure information as pairs of attribute-value[footnote:

For example, data format such as YAML or JSON encodes its attribute names along with its values. This simplifies encoding, but with overhead. ], but simply refers to an abbreviation for correct decoding.

Here are all the DIEs of hello represented as a tree:

<Graphics file: C:/Users/Tu Do/os01/book_src/images/06/dwarf_tree.svg> <dwarf_tree>

In the figure [dwarf_tree], DW_TAG_subprogram represents a function such as main. Its children are the DIEs of argc and argv. With such precise information, matching source to binary is an easy job for gdb.

If more than one compilation units exist in an executable binary, the DIE entries are sorted according to the compilation order from gcc. For example, suppose we have another test.c source file[footnote: It can contain anything. Just a sample file. ] and compile it together with hello:

$ gcc -masm=intel -m32 -g test.c hello.c -o hello

Then, the all DIE entries in test.c are displayed before the DIE entries in hello.c:

<0>: Abbrev Number: 1 (DW_TAG_compile_unit)

<c> DW_AT_producer : (indirect string, offset: 0x0):

GNU C11 5.4.0 20160609

-masm=intel -m32 -mtune=generic -march=i686 -g -fstack-protector-strong

<10> DW_AT_language : 12 (ANSI C99) <11> DW_AT_name : (indirect string, offset: 0x64):

test.c

<15> DW_AT_comp_dir : (indirect string, offset: 0x5f):

/tmp

<19> DW_AT_low_pc : 0x804840b <1d> DW_AT_high_pc : 0x6 <21> DW_AT_stmt_list : 0x0

<1><25>: Abbrev Number: 2 (DW_TAG_subprogram)

<26> DW_AT_external : 1 <26> DW_AT_name : bar <2a> DW_AT_decl_file : 1 <2b> DW_AT_decl_line : 1 <2c> DW_AT_low_pc : 0x804840b <30> DW_AT_high_pc : 0x6 <34> DW_AT_frame_base : 1 byte block: 9c

(DW_OP_call_frame_cfa)

<36> DW_AT_GNU_all_call_sites: 1

....after all DIEs in test.c listed....

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

1-Chapter-6.md

1-Chapter-6.md

第六章运行时的检查与调试

6.1 一个示例程序

6.2 对一个程序进行静态检查

6.2.1 命令：`info target`/`info file`/`info files`

6.2.2 命令：`maint info sections`

6.2.3 命令：`info functions`

6.2.3 命令：`info variables`

6.2.4 命令：`disassemble/disas`

6.2.6 命令：`x`

6.2.7 命令：`print/p`

6.3 程序的运行时检查

6.3.1 命令：`run`

6.3.2 命令：`break/b`

6.3.3 命令：`next/n`

6.3.4 命令：`step/s`

6.3.5 命令：`ni`

6.3.6 命令：`si`

6.3.7 命令：`until`

Files

1-Chapter-6.md

Latest commit

History

1-Chapter-6.md

File metadata and controls

第六章 运行时的检查与调试

6.1 一个示例程序

6.2 对一个程序进行静态检查

6.2.1 命令：info target/info file/info files

6.2.2 命令：maint info sections

6.2.3 命令：info functions

6.2.3 命令：info variables

6.2.4 命令：disassemble/disas

6.2.6 命令：x

6.2.7 命令：print/p

6.3 程序的运行时检查

6.3.1 命令：run

6.3.2 命令：break/b

6.3.3 命令：next/n

6.3.4 命令：step/s

6.3.5 命令：ni

6.3.6 命令：si

6.3.7 命令：until

第六章运行时的检查与调试

6.2.1 命令：`info target`/`info file`/`info files`

6.2.2 命令：`maint info sections`

6.2.3 命令：`info functions`

6.2.3 命令：`info variables`

6.2.4 命令：`disassemble/disas`

6.2.6 命令：`x`

6.2.7 命令：`print/p`

6.3.1 命令：`run`

6.3.2 命令：`break/b`

6.3.3 命令：`next/n`

6.3.4 命令：`step/s`

6.3.5 命令：`ni`

6.3.6 命令：`si`

6.3.7 命令：`until`