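A common problem in a lab environment is a process going down. Without a debugger such as gdb, it is hard to pinpoint exactly where the fault is. Debugging from a core file requires that a core file is actually generated when the process crashes, so we first set that up as root.
Check the current ulimit settings: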
omu1:/opt/y00249743 # ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
pending signals (-i) 194775
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
stack size (kbytes, -s) 2048
cpu time (seconds, -t) unlimited
max user processes (-u) 194775
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
The core file size is currently 0, so no core file is generated when a process crashes. Raise the limit:
ulimit -c unlimited
Check the settings after the change:
omu1:/opt/y00249743 # ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
pending signals (-i) 194775
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
stack size (kbytes, -s) 2048
cpu time (seconds, -t) unlimited
max user processes (-u) 194775
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Set the path where the core file is generated:
omu1:/opt/y00249743 # echo core > /proc/sys/kernel/core_pattern
The faulty code:
#include <stdio.h>

void core_here(char *ptr)
{
    *ptr = 0;
}

void test()
{
    char *ptr = NULL;
    core_here(ptr);
}

int main()
{
    test();
    return 0;
}
Compile the code:
omu1:/opt/y00249743 # gcc -g test.c -o test
Running it generates a core file in the current directory:
omu1:/opt/y00249743 # ./test
Segmentation fault (core dumped)
Debug the core file with gdb:
omu1:/opt/y00249743 # gdb test core
GNU gdb 6.6
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i586-suse-linux"...
Using host libthread_db library "/lib/libthread_db.so.1".
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `./test'.
Program terminated with signal 11, Segmentation fault.
#0 0x0804838a in core_here (ptr=0x0) at test.c:5
5 *ptr = 0;
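Here it is obvious at a glance that executing *ptr = 0 on line 5 of test.c brought the process down. When debugging code in a lab environment, however, we sometimes need to look at variable values and the call stack as well, so let's keep going.
View the call stack: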
(gdb) bt
#0 0x0804838a in core_here (ptr=0x0) at test.c:5
#1 0x080483a7 in test () at test.c:11
#2 0x080483bc in main () at test.c:16
The call stack contains three functions.
Inspect a specific variable:
(gdb) frame 0
#0 0x0804838a in core_here (ptr=0x0) at test.c:5
5 *ptr = 0;
Locate the surrounding code:
(gdb) l
1 #include <stdio.h>
2
3 void core_here(char *ptr)
4 {
5 *ptr = 0;
6 }
7
8 void test()
9 {
10 char *ptr = NULL;
(gdb) p ptr
$3 = 0x0