LV040 - Invoking the GDB Debugger

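Earlier we saw that running gdb followed by the name of an executable starts the GDB debugger. That is only the most common way to invoke it; GDB can be started in other ways too. To suit different scenarios, GDB also accepts command-line options at startup that control which features it enables or disables.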

I. Starting with the bare command

Enter the gdb command by itself, without an executable name:

shell
$ gdb
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb)

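When GDB is started this way, no program has been specified, so you must name one after startup with the file or exec-file command (covered later).

II. Debugging a program that is not yet running

For an executable that carries debug information (compiled with the -g option), GDB is invoked as: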

shell
gdb program.out

Here program.out is the file name of the executable, for example the a.out mentioned earlier.

III. Debugging a running program

1. Test program

c
#include <stdio.h>
#include <unistd.h>

int main(int argc, const char *argv[]) {
    /* code */
    int i = 0;
    printf("Hello, World! \n");

    while (1) {
        i++;
        sleep(1);
        if (i > 100) {
            printf("i=%d\n", i++);
            i = 0;
        }
    }

    return 0;
}

Compile it with the following command:

shell
gcc 010-gdb-debug-running.c -g -Wall

Then leave it running in the background:

shell
./a.out &

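2. How to debug a program that is already running

Sometimes we want to debug a program that has already been started without restarting it. GDB supports this: it can attach to a running C or C++ program.

2.1 Process PID

When a C or C++ program runs, the operating system runs it in one process (or several), and each process is assigned a unique process ID (PID) so that the system can manage them. The PID can be looked up with the ps command or with pidof: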

shell
ps
pidof program.out

In this case:

shell
 sumu@virtual-machine:~/workspace/c-learning/02-c-basic/21-debug [main  +1 ~0 -0 !]
$ ps
    PID TTY          TIME CMD
   5292 pts/2    00:00:00 bash
   9454 pts/2    00:00:00 a.out
   9559 pts/2    00:00:00 ps
 sumu@virtual-machine:~/workspace/c-learning/02-c-basic/21-debug [main  +1 ~0 -0 !]
$ pidof a.out
9454
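
When the target is a program you wrote yourself, one option is to have it print its own PID at startup so there is no need to look it up with ps or pidof. The following is only a minimal sketch; `print_attach_hint` is a hypothetical helper name, not part of the test program above.

```c
#include <stdio.h>
#include <unistd.h>

/* Print the current PID together with a ready-to-paste attach command.
   Returns the value from fprintf (negative on an output error). */
int print_attach_hint(FILE *out)
{
    long pid = (long)getpid();   /* getpid() never fails */
    return fprintf(out, "PID %ld: attach with `gdb -p %ld`\n", pid, pid);
}
```

Calling `print_attach_hint(stdout)` at the top of `main` prints one line containing the PID that the attach commands below expect.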

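2.2 Starting gdb

Next we attach gdb to the process. Any of the following commands does this (in the first form GDB parses attach as a program file name, so it may print a "No such file or directory" message before attaching to PID):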

shell
gdb attach PID
gdb program.out PID
gdb -p PID

In this case:

shell
 sumu@virtual-machine:~/workspace/c-learning/02-c-basic/21-debug [main  +1 ~0 -0 !]
$ gdb -p 9454
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 9454
Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.

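Attaching fails because of a ptrace permission restriction on Linux. Ubuntu and some other distributions enable the Yama security module by default, which restricts ordinary users from attaching to arbitrary processes with the ptrace system call. There are two ways around this:

  • (1) Run gdb with sudo (recommended for one-off debugging)
bash
sudo gdb -p 9454
  • (2) Temporarily change ptrace_scope
bash
# Setting 0 allows same-user attach; the default value is restored after a reboot.
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope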
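
To see which restriction is currently in force before changing anything, the value in that file can be read and interpreted. The sketch below assumes the four Yama modes (0 to 3) as described in the kernel's Yama documentation; both function names are hypothetical helpers introduced for illustration.

```c
#include <stdio.h>

/* Map a Yama ptrace_scope value to a short description. */
const char *describe_ptrace_scope(int value)
{
    switch (value) {
    case 0:  return "classic: a process may attach to others with the same uid";
    case 1:  return "restricted: only an ancestor (or a declared tracer) may attach";
    case 2:  return "admin-only: attaching requires CAP_SYS_PTRACE";
    case 3:  return "no attach: ptrace attach is disabled and cannot be re-enabled";
    default: return "unknown value";
    }
}

/* Read the current setting from the given path.
   Returns -1 if the file cannot be read or parsed
   (for example when Yama is not enabled on this kernel). */
int read_ptrace_scope(const char *path)
{
    int value = -1;
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%d", &value) != 1)
        value = -1;
    fclose(fp);
    return value;
}
```

Passing `/proc/sys/kernel/yama/ptrace_scope` to `read_ptrace_scope` and the result to `describe_ptrace_scope` gives a one-line summary. Mode 1 is consistent with what was observed here: gdb can debug a program it launches itself, but a plain user cannot attach to an unrelated running process.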

Here we use the second method and then run gdb again:

shell
 sumu@virtual-machine:~/workspace/c-learning/02-c-basic/21-debug [main  +1 ~0 -0 !]
$ gdb -p 9454 -q
Attaching to process 9454
Reading symbols from /home/sumu/workspace/c-learning/02-c-basic/21-debug/a.out...
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...
Reading symbols from /usr/lib/debug/.build-id/57/92732f783158c66fb4f3756458ca24e46e827d.debug...
Reading symbols from /lib64/ld-linux-x86-64.so.2...
Reading symbols from /usr/lib/debug/.build-id/db/3ae442c4308e6250049fb6159c302cf4274fa2.debug...
0x00007f1d8e8261b4 in __GI___clock_nanosleep (clock_id=<optimized out>, clock_id@entry=0, flags=flags@entry=0, 
    req=req@entry=0x7ffffc40e130, rem=rem@entry=0x7ffffc40e130) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
78      ../sysdeps/unix/sysv/linux/clock_nanosleep.c: No such file or directory.

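This time the error above no longer appears.

2.3 Getting to a specific location

When we attach to a program that is already running, we do not know in advance which line it has reached. In the session above the program was inside sleep(), so gdb stopped it inside the sleep implementation, namely libc's clock_nanosleep function. Here are several ways to get back to our own source code:

2.3.1 Using the finish command

finish runs the current function to completion and returns to its caller: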

txt
(gdb) finish

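Each finish only runs the selected frame until it returns and then stops in the caller. Starting from clock_nanosleep, it has to be repeated (through nanosleep and then sleep()) before execution is back in your own source code, so with several library frames on the stack it is not very convenient.

2.3.2 Moving through the call stack

The up command moves one frame up the call stack: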

shell
(gdb) up

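Repeat up until you reach the source-code level. This only changes the frame GDB is inspecting; the program itself does not run. You can also look at the call stack first with bt (backtrace):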

shell
(gdb) bt
#0  0x00007f1d8e8261b4 in __GI___clock_nanosleep (clock_id=<optimized out>, clock_id@entry=0, flags=flags@entry=0, 
    req=req@entry=0x7ffffc40e130, rem=rem@entry=0x7ffffc40e130) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
#1  0x00007f1d8e82bec7 in __GI___nanosleep (requested_time=requested_time@entry=0x7ffffc40e130, 
    remaining=remaining@entry=0x7ffffc40e130) at nanosleep.c:27
#2  0x00007f1d8e82bdfe in __sleep (seconds=0) at ../sysdeps/posix/sleep.c:55
#3  0x000055c7e2a191bd in main (argc=1, argv=0x7ffffc40e288) at 010-gdb-debug-running.c:11

Then jump straight to the corresponding frame:

shell
(gdb) frame 3

Now the l (list) command shows the source code directly:

shell
(gdb) frame 3
#3  0x000055c7e2a191bd in main (argc=1, argv=0x7ffffc40e288) at 010-gdb-debug-running.c:11
11              sleep(1);
(gdb) l
6           int i = 0;
7           printf("Hello, World! \n");
8
9           while (1) {
10              i++;
11              sleep(1);
12              if (i > 100) {
13                  printf("i=%d\n", i++);
14                  i = 0;
15              }

2.3.3 Setting a breakpoint directly

Set a breakpoint in the source code, then continue execution:

txt
(gdb) break your_file.c:num
(gdb) continue

For example, here:

shell
(gdb) b 010-gdb-debug-running.c:12
Breakpoint 1 at 0x55c7e2a191bd: file 010-gdb-debug-running.c, line 12.
(gdb) c
Continuing.

Breakpoint 1, main (argc=1, argv=0x7ffffc40e288) at 010-gdb-debug-running.c:12
12              if (i > 100) {

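2.4 Leaving the debugger

When debugging is finished, if you want the program to continue running without further interference from the debugger, detach GDB from it manually. This takes two steps:

(1) Run the detach command to separate GDB from the program;

(2) Run the quit (or q) command to exit GDB.

IV. Debugging a crashed program

C or C++ programs often crash at run time because of exceptions or bugs, such as out-of-bounds memory access (an array index out of range, or printing a string that lacks a terminating \0) or dereferencing a null pointer. When that happens the program needs to be debugged.

As covered earlier, when a program crashes on Linux, the system can automatically record the memory contents, call stack, and other state at the moment of the crash and store them in a file, usually called a core file. This facility is known as a core dump. GDB has strong support for analyzing core files, and debugging the core file produced by a crash often leads to the cause more quickly.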
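
Whether a core file is written at all depends first on the process's RLIMIT_CORE resource limit. As a smaller companion to the test program below, this sketch checks the soft limit and raises it to the hard limit, which needs no root privilege. The two function names are hypothetical helpers, and whether a file actually appears also depends on the system's /proc/sys/kernel/core_pattern setting.

```c
#include <sys/resource.h>

/* Returns 1 if the soft RLIMIT_CORE is non-zero, 0 if it is zero
   (no core file will be written), -1 if the limit cannot be read. */
int core_dumps_enabled(void)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;
    return lim.rlim_cur != 0 ? 1 : 0;
}

/* Raise the soft limit to the current hard limit.
   An unprivileged process may always do this.
   Returns 0 on success, -1 on failure. */
int raise_core_soft_limit(void)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim);
}
```

Calling `raise_core_soft_limit()` early in `main` has roughly the effect of `ulimit -c` set to the hard limit, but only for that process and its children.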

1. Test program

c
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/resource.h>

int sys_open_coredump(char *pCorePid, char *pCorePath) {
    int           iRet = -1;
    int           iFd1 = -1;
    int           iFd2 = -1;
    struct rlimit limit = {0};
    struct rlimit limit_set = {0};

    /* 1. set core ulimit, ulimit -c coredump_size */
    if (getrlimit(RLIMIT_CORE, &limit)) {
        printf("[error]get resource limit fail!\n");
        return -1;
    }

    limit_set.rlim_cur = RLIM_INFINITY;
    limit_set.rlim_max = RLIM_INFINITY;
    if (setrlimit(RLIMIT_CORE, &limit_set)) {
        limit_set.rlim_cur = limit_set.rlim_max = limit.rlim_max;
        if (limit.rlim_max != RLIM_INFINITY) {
            // printf( "CORE: cur=0x%x, max=0x%x\n",
            // limit.rlim_cur, limit.rlim_max);
        }
        if (setrlimit(RLIMIT_CORE, &limit_set)) {
            printf("[error]set core ulimited fail!\n");
            return -1;
        }
    }

    /* 2. set core use pid */
    if (pCorePid && strlen(pCorePid) > 0) {
        iFd1 = open("/proc/sys/kernel/core_uses_pid", O_RDWR | O_NDELAY | O_TRUNC, 0);
        if (iFd1 < 0) {
            printf("[error]open core_uses_pid fail!\n");
            return -1;
        }
        if (strlen(pCorePid) != write(iFd1, pCorePid, strlen(pCorePid))) {
            printf("[error]set core_uses_pid fail!\n");
            close(iFd1);
            return -1;
        }
        close(iFd1);
    }

    /* 3. set core pattern */
    if (pCorePath && strlen(pCorePath) > 0) {
        iFd2 = open("/proc/sys/kernel/core_pattern", O_RDWR | O_NDELAY | O_TRUNC, 0);
        if (iFd2 < 0) {
            printf("[error]open core_pattern fail!\n");
            return -1;
        }
        if (strlen(pCorePath) != write(iFd2, pCorePath, strlen(pCorePath))) {
            printf("[error]set core_pattern fail!\n");
            close(iFd2);
            return -1;
        }
        close(iFd2);
    }

    iRet = 0;
    printf("set core dump open success!\n");

    return 0;
}

void sys_init_coredump(void) {
    char stCorePid[32] = {0};
    char stCorePath[128] = {0};

    /* To tell core files apart: the default is usually 0 (no PID); when set to 1, the core dump file name includes the PID */
    strcpy(stCorePid, "1");

    memset(stCorePath, 0, sizeof(stCorePath));

    /* directory where core files are saved */
    strcpy(stCorePath, "./");

    if (access(stCorePath, F_OK) != 0) {
        printf("[error]Exception access stCorePath:%s is fail!\n", stCorePath);
        return;
    }

    /* core file name: a string containing placeholders such as %e, %s, %p, %t,
       which are replaced by the program name, signal number, process PID, and Unix timestamp */
    strcat(stCorePath, "core-%s-%e-%p-%t");

    if (0 != sys_open_coredump(stCorePid, stCorePath)) {
        printf("[error]========= gdb core dump open fail! =========\n\n");
    }
    else {
        printf("sys gdb core open success!!!\n");
    }
}

int main(int argc, const char *argv[]) {
    int *p = NULL;
    sys_init_coredump();
    printf("&p = %p, p = %p\n", &p, p);
    *p = 5;
    printf("*p=%d\n", *p);
    return 0;
}

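On Ubuntu this program has to be run with sudo, otherwise it fails (it writes to files under /proc/sys/kernel); after it runs, the corresponding core dump file is generated.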
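
To make the placeholder substitution in the core file name concrete, here is a small user-space sketch that expands the four specifiers used by the test program. It is only an illustration: `expand_core_pattern` is a hypothetical helper, not a kernel or libc API, and the kernel supports more specifiers than the ones handled here.

```c
#include <stdio.h>
#include <string.h>

/* Expand %e, %s, %p, %t and %% in a core_pattern-style string.
   Specifiers this sketch does not know are skipped.
   Returns 0 on success, -1 if the output buffer is too small. */
int expand_core_pattern(const char *pattern, const char *exe, int sig,
                        long pid, long epoch, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (const char *p = pattern; *p != '\0'; p++) {
        char piece[64] = "";
        const char *text = piece;

        if (*p != '%') {
            piece[0] = *p;            /* ordinary character: copy as-is */
        } else {
            p++;
            if (*p == '\0')
                break;                /* a lone trailing '%' is dropped */
            switch (*p) {
            case 'e': text = exe; break;                                   /* program name */
            case 's': snprintf(piece, sizeof piece, "%d", sig); break;     /* signal number */
            case 'p': snprintf(piece, sizeof piece, "%ld", pid); break;    /* process PID */
            case 't': snprintf(piece, sizeof piece, "%ld", epoch); break;  /* Unix timestamp */
            case '%': piece[0] = '%'; break;                               /* literal percent */
            default:  break;                                               /* not handled here */
            }
        }

        size_t len = strlen(text);
        if (used + len + 1 > out_size)
            return -1;
        memcpy(out + used, text, len + 1);   /* copies the terminating NUL too */
        used += len;
    }
    return 0;
}
```

With the pattern `./core-%s-%e-%p-%t`, signal 11 (SIGSEGV), program name `a.out`, PID 12621 and timestamp 1777776566, the helper produces `./core-11-a.out-12621-1777776566`.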

2. Generating the core dump

See the earlier notes on core dumps for this.

3. The gdb command

To debug a core file, GDB is invoked as:

shell
gdb program.out core

In this case:

shell
 sumu@virtual-machine:~/workspace/c-learning/02-c-basic/21-debug [main  +1 ~0 -0 !]
$ sudo ./a.out                # <=== 1. run the executable with sudo
[sudo] password for sumu: 
set core dump open success!
sys gdb core open success!!!
&p = 0x7ffcc2eb10d0, p = (nil)
Segmentation fault            # <=== the program crashes
 sumu@virtual-machine:~/workspace/c-learning/02-c-basic/21-debug [main  +1 ~0 -0 !]
$ chmod 777 core-11-a.out-12621-1777776566   
chmod: changing permissions of 'core-11-a.out-12621-1777776566': Operation not permitted
 sumu@virtual-machine:~/workspace/c-learning/02-c-basic/21-debug [main  +1 ~0 -0 !]
$ sudo chmod 777 core-11-a.out-12621-1777776566  # <=== 2. change the core file permissions so it can be read
 sumu@virtual-machine:~/workspace/c-learning/02-c-basic/21-debug [main  +1 ~0 -0 !]
$ gdb a.out core-11-a.out-12621-1777776566       # <=== 3. start gdb on the core file
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a.out...
[New LWP 12621]
Core was generated by `./a.out'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  main (argc=1, argv=0x7ffcc2eb11d8) at 002-set-coresump.c:105 
105         *p = 5;        # <=== 4. gdb reports the crashing line directly
(gdb)