description
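Is it easy to debug the linux kernel with gdb? Getting to that point really is not easy. No single step is hard, there is just a lot you need to know. There are two ways to debug the linux kernel with gdb: UML (User-Mode Linux) and qemu. This post is mainly about qemu. Building and installing qemu from source is a real hassle.
Preparing the environment
linux OS: Debian7.5-i386 (the latest Wheezy at the time, installed on VMware10. I used the network installer and run it in text mode, because my laptop has limited resources!)
root fs: Debian-Wheezy-x86-root_fs.bz2 (downloaded earlier, apparently Debian7.0, but that does not matter because it can be upgraded. Download from http://fs.devloop.org.uk/)
linux kernel source: linux-3.2.59.tar.xz (chosen because it is close to the kernel version of Debian7.5-i386.)
qemu: 1.1.2, installed with apt-get install, since building from source is a hassle.
/etc/qemu-ifup: configures an IP (when qemu is started with the net options, it creates the tap network interface and this script sets it up.)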
#!/bin/sh
/sbin/ifconfig $1 10.0.2.11
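The two-line script above leaves the netmask to ifconfig's default. As a minimal sketch (not what I actually ran), here is a variant that sets the mask explicitly. TAP_IP, TAP_MASK and tap_ifconfig_args are names made up for this illustration, not qemu settings.

```shell
#!/bin/sh
# Sketch of an /etc/qemu-ifup variant with an explicit netmask.
# TAP_IP and TAP_MASK are illustrative variables, not qemu options.
TAP_IP="${TAP_IP:-10.0.2.11}"
TAP_MASK="${TAP_MASK:-255.255.255.0}"

# Print the ifconfig arguments for the tap device named in $1.
tap_ifconfig_args() {
    echo "$1 $TAP_IP netmask $TAP_MASK up"
}

# qemu calls the script with the tap device name as the first argument;
# without an argument nothing is configured.
if [ "$#" -ge 1 ]; then
    /sbin/ifconfig $(tap_ifconfig_args "$1")
fi
```

Like the original, it needs to be executable and run as root to have any effect.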
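Build and debug tools are a must, and you definitely need network access: whatever you need, just apt-get install xxx and you are done, which is much more convenient!
Building the kernel
Downloading and unpacking are not covered here. Go straight to the build: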
make defconfig
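make menuconfig --> go in and set some debug options, not covered here!
make -j 8 bzImage --> start the build, then a long wait...
make modules_install INSTALL_MOD_PATH=../fs --> installs the kernel modules under ../fs (they have to be built with make modules first). This step is optional.
The kernel build is now complete, which gives us linux-3.2.59/arch/x86/boot/bzImage (used to boot qemu) and linux-3.2.59/vmlinux (used for gdb attach).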
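I skipped the menuconfig details, but it is worth confirming afterwards that the options you wanted actually ended up in .config. A small sketch of such a check follows. The function names are made up here, and which symbols to check is my assumption: CONFIG_DEBUG_INFO is the one that makes the build carry the debug symbols gdb reads from vmlinux.

```shell
#!/bin/sh
# Succeed if a config symbol is set to "y" in the given .config file.
# Usage: config_enabled <config-file> <SYMBOL without the CONFIG_ prefix>
config_enabled() {
    grep -q "^CONFIG_$2=y\$" "$1"
}

# Print one status line per symbol.
# Usage: check_debug_config <config-file> <SYMBOL>...
check_debug_config() {
    cfg=$1
    shift
    for sym in "$@"; do
        if config_enabled "$cfg" "$sym"; then
            echo "$sym: enabled"
        else
            echo "$sym: NOT enabled"
        fi
    done
}
```

For example: check_debug_config linux-3.2.59/.config DEBUG_KERNEL DEBUG_INFO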
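On turning off optimization: much of the kernel source will not compile without optimization, so if we are interested in one particular file we can turn it off just for that file, like this:
init/main.c --> add to init/Makefile: CFLAGS_main.o = -O0
net/socket.c --> add to net/Makefile: CFLAGS_socket.o = -O0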
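If you keep switching files, editing the Makefiles by hand gets tedious. A minimal sketch of a helper that appends the per-file flag only when it is not already there (add_O0 is a name I made up for this illustration):

```shell
#!/bin/sh
# Append "CFLAGS_<obj> = -O0" to a kernel Makefile unless the exact
# line is already present.
# Usage: add_O0 <Makefile> <object name, e.g. main.o>
add_O0() {
    makefile=$1
    line="CFLAGS_$2 = -O0"
    if grep -qxF -- "$line" "$makefile"; then
        echo "already present: $line"
    else
        echo "$line" >> "$makefile"
        echo "added: $line"
    fi
}
```

For example: add_O0 linux-3.2.59/init/Makefile main.o, then rebuild bzImage.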
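Upgrading the root fs and installing packages
mount -o loop ./Debian-Wheezy-x86-root_fs fs/ --> mount the root fs on the fs directory, so the root fs can be modified directly through fs.
chroot ./fs --> switch into the root fs
mount -t proc /proc /proc --> manually mount proc for the new root fs
The new root fs cannot reach the network yet because the nameserver needs changing. Replace the contents of /etc/resolv.conf with those from the previous root fs (that is, the host's) and the network works.
If the root fs has a password that you do not know, you can remove it with passwd -d or set a new one.
apt-get upgrade --> not required, I did it only to avoid unnecessary errors when installing packages.
apt-get install gcc g++ make gdb openssh-server -y --> just installs some packages I thought might be useful!
All in all, chroot really is a powerful tool: it can be used to build a new linux release.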
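The steps above can be strung together in a script. This is only a sketch under a few assumptions: it copies the host's /etc/resolv.conf instead of editing the file by hand, and it unmounts everything at the end so the image is free for qemu, which the steps above do not show. The names run, prepare_rootfs, DRY_RUN, IMG and MNT are made up here. DRY_RUN defaults to 1, so by default it only prints what it would do.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints the commands. Set DRY_RUN=0 and
# run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
IMG="${IMG:-./Debian-Wheezy-x86-root_fs}"
MNT="${MNT:-./fs}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_rootfs() {
    run mount -o loop "$IMG" "$MNT"
    # Assumption: reuse the host's resolver configuration.
    run cp /etc/resolv.conf "$MNT/etc/resolv.conf"
    run chroot "$MNT" mount -t proc proc /proc
    run chroot "$MNT" apt-get install -y gcc g++ make gdb openssh-server
    # Assumption: clean up so qemu can use the image afterwards.
    run chroot "$MNT" umount /proc
    run umount "$MNT"
}
```

Calling prepare_rootfs with the defaults prints six lines, one per step, each prefixed with "+ ".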
Running qemu and debugging the kernel with gdb
Starting it is simple. Here I use qemu's gdbserver mode, with the guest stopped at startup and the stub listening on tcp::1234. cmd:
root@debian:~# qemu-system-i386 -kernel ./linux-3.2.59/arch/x86/boot/bzImage -append "console=ttyS0 rdinit=/bin/sh root=/dev/sda rw mem=256M" --boot c -nographic -hda ./Debian-Wheezy-x86-root_fs
-m 256 -k en-us -S -s -net nic -net tap
QEMU 1.1.2 monitor - type 'help' for more information
(qemu) QEMU 1.1.2 monitor - type 'help' for more information
(qemu)
(qemu) QEMU 1.1.2 monitor - type 'help' for more information
(qemu)
No X window system is needed here: with the -nographic option, everything can be done from an ordinary remote terminal.
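The two options that matter for debugging are -S and -s: -S keeps the CPU frozen at startup until gdb tells it to continue, and -s is shorthand for -gdb tcp::1234. As a sketch, here is a small function that prints a trimmed-down version of the command line with only a subset of the options I used. qemu_debug_cmd, KERNEL and ROOTFS are names made up for this illustration.

```shell
#!/bin/sh
KERNEL="${KERNEL:-./linux-3.2.59/arch/x86/boot/bzImage}"
ROOTFS="${ROOTFS:-./Debian-Wheezy-x86-root_fs}"

# Print (do not run) a qemu command line for kernel debugging.
#   -S  freeze the CPU at startup, wait for gdb to continue
#   -s  shorthand for -gdb tcp::1234
qemu_debug_cmd() {
    printf '%s ' qemu-system-i386 \
        -kernel "$KERNEL" \
        -hda "$ROOTFS" \
        -append '"console=ttyS0 root=/dev/sda rw mem=256M"' \
        -m 256 -nographic \
        -S -s \
        -net nic -net tap
    echo
}
```

Printing first makes it easy to review the options. Once it looks right, eval "$(qemu_debug_cmd)" runs it, provided the paths contain no spaces.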
The next step is to start gdb and run target remote tcp::1234. cmd:
root@debian:~# gdb ./linux-3.2.59/vmlinux
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/linux-3.2.59/vmlinux...done.
(gdb) target remote tcp::1234
Remote debugging using tcp::1234
0x0000fff0 in ?? ()
(gdb) b start_kernel
Breakpoint 1 at 0xc1881629: file init/main.c, line 469.
(gdb) c
Continuing.
Breakpoint 1, start_kernel () at init/main.c:469
469     {
(gdb) n
473             smp_setup_processor_id();
(gdb) l
468     asmlinkage void __init start_kernel(void)
469     {
470             char * command_line;
471             extern const struct kernel_param __start___param[], __stop___param[];
472
473             smp_setup_processor_id();
474
475     /*
476      * Need to run as early as possible, to initialize the
477      * lockdep hash:
(gdb)
(gdb) c
Continuing.
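Typing the same three commands every time gets old. A minimal sketch that writes them to a gdb command file, which gdb can then read with -x. The function name and the file name kernel.gdb are made up for this illustration.

```shell
#!/bin/sh
# Write a gdb command file that repeats the interactive steps:
# connect to qemu's gdb stub, break at start_kernel, then continue.
# Usage: write_gdb_script <output-file>
write_gdb_script() {
    cat > "$1" <<'EOF'
target remote tcp::1234
break start_kernel
continue
EOF
}
```

For example: write_gdb_script kernel.gdb && gdb -x kernel.gdb ./linux-3.2.59/vmlinux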
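As the gdb session above shows, we really can debug the linux kernel. Let's skip over the boot for now and look at the qemu boot log. Only the last part is pasted here: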
[....] Cleaning up temporary files.... ok
INIT: Entering runlevel: 2
[info] Using makefile-style concurrent boot in runlevel 2.
[....] Starting enhanced syslogd: rsyslogd. ok
[....] Starting periodic command scheduler: cron. ok
[....] Starting OpenBSD Secure Shell server: sshd. ok
Debian GNU/Linux 7 changeme ttyS0
changeme login: root
Password:
Last login: Mon May 26 15:44:04 UTC 2014 on ttyS0
Linux changeme 3.2.59 #2 SMP Mon May 26 09:40:43 EDT 2014 i686
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
root@changeme:~#
root@changeme:~# uname -a
Linux changeme 3.2.59 #2 SMP Mon May 26 09:40:43 EDT 2014 i686 GNU/Linux
See, it is up, and the kernel version is the one we built earlier. gcc was installed before, so let's try it out. cmd:
root@changeme:~# cat hello.c
#include <stdio.h>
#include <stdlib.h>
void main()
{
    printf("hello world!\n");
}
root@changeme:~# gcc hello.c
root@changeme:~# ls -l a.out
-rwxr-xr-x 1 root root 4980 May 27 08:23 a.out
root@changeme:~# ./a.out
hello world!
root@changeme:~#
gcc works fine. Now let's look at the network interfaces. cmd:
root@changeme:~# ifconfig
eth0      Link encap:Ethernet  HWaddr 52:54:00:12:34:56
          inet6 addr: fe80::5054:ff:fe12:3456/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:3330 (3.2 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:1 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:112 (112.0 B)  TX bytes:112 (112.0 B)
DHCP did not hand out an IP, so set one manually. cmd:
root@changeme:~# ifconfig eth0 10.0.2.15 netmask 255.255.255.0
root@changeme:~# ifconfig
eth0      Link encap:Ethernet  HWaddr 52:54:00:12:34:56
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
          inet6 addr: fe80::5054:ff:fe12:3456/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:18 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:3672 (3.5 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:1 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:112 (112.0 B)  TX bytes:112 (112.0 B)
After setting it manually, the interface does have an IP. Can the host ping this IP? Let's look at the IPs on the host:
root@debian:~# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0c:29:65:c4:5c
          inet addr:192.168.91.136  Bcast:192.168.91.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe65:c45c/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3586 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5985 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:247944 (242.1 KiB)  TX bytes:696252 (679.9 KiB)
          Interrupt:19 Base address:0x2000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:340 errors:0 dropped:0 overruns:0 frame:0
          TX packets:340 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:22302 (21.7 KiB)  TX bytes:22302 (21.7 KiB)

tap0      Link encap:Ethernet  HWaddr 26:2c:07:73:43:21
          inet addr:10.0.2.11  Bcast:10.255.255.255  Mask:255.0.0.0
          inet6 addr: fe80::242c:7ff:fe73:4321/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:27 errors:0 dropped:0 overruns:0 frame:0
          TX packets:34 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:5850 (5.7 KiB)  TX bytes:7246 (7.0 KiB)
The host now has an extra interface, tap0, whose IP 10.0.2.11 is in the same subnet as 10.0.2.15, the IP of the linux running in qemu. So let's ping it from the host. cmd:
root@debian:~# ping 10.0.2.15
PING 10.0.2.15 (10.0.2.15) 56(84) bytes of data.
64 bytes from 10.0.2.15: icmp_req=1 ttl=64 time=0.652 ms
64 bytes from 10.0.2.15: icmp_req=2 ttl=64 time=1.98 ms
64 bytes from 10.0.2.15: icmp_req=3 ttl=64 time=1.04 ms
64 bytes from 10.0.2.15: icmp_req=4 ttl=64 time=0.993 ms
^C
--- 10.0.2.15 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3006ms
rtt min/avg/max/mdev = 0.652/1.168/1.982/0.494 ms
root@debian:~#
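The ping needs no extra routing because both ends see the other address as on-link: tap0 has mask 255.0.0.0 and the guest's eth0 has 255.255.255.0, and 10.0.2.11 and 10.0.2.15 match under either mask. A small sketch of that check in plain sh arithmetic (ip_to_int and same_subnet are names made up for this illustration):

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if two addresses are in the same network under a netmask.
# Usage: same_subnet <ip-a> <ip-b> <netmask>
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}
```

For example, same_subnet 10.0.2.11 10.0.2.15 255.255.255.0 succeeds, while comparing 10.0.2.11 with the host's eth0 address 192.168.91.136 fails.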
Perfect, the ping works. Can we ssh in? First check whether sshd is running on the linux inside qemu (it was installed earlier). cmd:
root@changeme:~# netstat -apn
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      2591/sshd
tcp6       0      0 :::22                   :::*                    LISTEN      2591/sshd
udp        0      0 0.0.0.0:13820           0.0.0.0:*                           2397/dhclient
udp        0      0 0.0.0.0:68              0.0.0.0:*                           2397/dhclient
udp6       0      0 :::30821                :::*                                2397/dhclient
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node   PID/Program name    Path
unix  2      [ ACC ]     SEQPACKET  LISTENING     1680     973/udevd           /run/udev/control
unix  4      [ ]         DGRAM                    3497     2526/rsyslogd       /dev/log
unix  2      [ ]         DGRAM                    3610     2397/dhclient
unix  2      [ ]         DGRAM                    3575     2617/login
unix  3      [ ]         DGRAM                    1689     973/udevd
unix  3      [ ]         DGRAM                    1688     973/udevd
root@changeme:~#
Perfect, sshd is up. Let me try to ssh in. cmd:
root@debian:~# ssh test@10.0.2.15
test@10.0.2.15's password:
Linux changeme 3.2.59 #2 SMP Mon May 26 09:40:43 EDT 2014 i686
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Tue May 27 08:17:31 2014 from 10.0.2.11
Could not chdir to home directory /home/sam: No such file or directory
$ su root
Password:
root@changeme:/# uname -a
Linux changeme 3.2.59 #2 SMP Mon May 26 09:40:43 EDT 2014 i686 GNU/Linux
root@changeme:/#
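Perfect: logged in as the test user, then switched to root.
Debugging the kernel tcp/ip stack with gdb
The network environment is now in place. Next we start a server on the qemu linux and run a client on the host that connects to that server. I grabbed two examples off the web: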
root@changeme:~# cat server1.c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
int main(int argc, char *argv[])
{
    int listenfd = 0, connfd = 0;
    struct sockaddr_in serv_addr;
    char sendBuff[1025];
    time_t ticks;
    listenfd = socket(AF_INET, SOCK_STREAM, 0);
    memset(&serv_addr, '0', sizeof(serv_addr));
    memset(sendBuff, '0', sizeof(sendBuff));
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    serv_addr.sin_port = htons(5000);
    bind(listenfd, (struct sockaddr*)&serv_addr, sizeof(serv_addr));
    listen(listenfd, 10);
    while(1)
    {
        connfd = accept(listenfd, (struct sockaddr*)NULL, NULL);
        ticks = time(NULL);
        snprintf(sendBuff, sizeof(sendBuff), "%.24s\r\n", ctime(&ticks));
        write(connfd, sendBuff, strlen(sendBuff));
        close(connfd);
        sleep(1);
    }
}
root@changeme:~# gcc -o server1 server1.c
root@changeme:~# ls -l server1*
-rwxr-xr-x 1 root root 6562 May 27 08:27 server1
-rw-r--r-- 1 root root 1022 May 26 14:40 server1.c
root@changeme:~# ./server1
root@changeme:~#
root@debian:~# cat client1.c
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <arpa/inet.h>
int main(int argc, char *argv[])
{
    int sockfd = 0, n = 0;
    char recvBuff[1024];
    struct sockaddr_in serv_addr;
    if(argc != 2)
    {
        printf("\n Usage: %s <ip of server> \n",argv[0]);
        return 1;
    }
    memset(recvBuff, '0',sizeof(recvBuff));
    if((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
    {
        printf("\n Error : Could not create socket \n");
        return 1;
    }
    memset(&serv_addr, '0', sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_port = htons(5000);
    if(inet_pton(AF_INET, argv[1], &serv_addr.sin_addr)<=0)
    {
        printf("\n inet_pton error occured\n");
        return 1;
    }
    if( connect(sockfd, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) < 0)
    {
        printf("\n Error : Connect Failed \n");
        return 1;
    }
    while ( (n = read(sockfd, recvBuff, sizeof(recvBuff)-1)) > 0)
    {
        recvBuff[n] = 0;
        if(fputs(recvBuff, stdout) == EOF)
        {
            printf("\n Error : Fputs error\n");
        }
    }
    if(n < 0)
    {
        printf("\n Read error \n");
    }
    return 0;
}
root@debian:~# gcc -o client1 client1.c
root@debian:~# ls -l client1*
-rwxr-xr-x 1 root root 6245 May 27 04:26 client1
-rw-r--r-- 1 root root 1351 May 27 04:26 client1.c
root@debian:~# ./client1 10.0.2.15
Tue May 27 08:28:33 2014
As shown above, it works. Now set a breakpoint on bind in gdb and then start server1. cmd:
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
default_idle () at arch/x86/kernel/process.c:369
369             current_thread_info()->status |= TS_POLLING;
(gdb) bt
#0  default_idle () at arch/x86/kernel/process.c:369
#1  0xc1001a3f in cpu_idle () at arch/x86/kernel/process_32.c:116
#2  0xc15ed776 in rest_init () at init/main.c:387
#3  0xc1881920 in start_kernel () at init/main.c:641
#4  0xc18810ac in i386_start_kernel () at arch/x86/kernel/head32.c:68
#5  0x00000000 in ?? ()
(gdb) b sys_bind
Breakpoint 2 at 0xc14a7735: file net/socket.c, line 1431.
(gdb) c
Continuing.
Breakpoint 2, sys_bind (fd=3, umyaddr=0xbf8b93c8, addrlen=16) at net/socket.c:1431
1431            sock = sockfd_lookup_light(fd, &err, &fput_needed);
(gdb) l
1426    {
1427            struct socket *sock;
1428            struct sockaddr_storage address;
1429            int err, fput_needed;
1430
1431            sock = sockfd_lookup_light(fd, &err, &fput_needed);
1432            if (sock) {
1433                    err = move_addr_to_kernel(umyaddr, addrlen, (struct sockaddr *)&address);
1434                    if (err >= 0) {
1435                            err = security_socket_bind(sock,
(gdb)
.........................
(gdb) bt
#0  inet_bind (sock=0xcf577300, uaddr=0xcfb55ed0, addr_len=16) at net/ipv4/af_inet.c:465
#1  0xc14a77af in sys_bind (fd=3, umyaddr=0xbf8b93c8, addrlen=16) at net/socket.c:1439
#2  0xc14a8b67 in sys_socketcall (call=2, args=0xbf8b8fb0) at net/socket.c:2421
#3  <signal handler called>
#4  0xb7687d22 in ?? ()
#5  0xb75ca723 in ?? ()
(gdb) l
460             unsigned short snum;
461             int chk_addr_ret;
462             int err;
463
464             /* If the socket has its own bind function then use it. (RAW) */
465             if (sk->sk_prot->bind) {
466                     err = sk->sk_prot->bind(sk, uaddr, addr_len);
467                     goto out;
468             }
469             err = -EINVAL;
And that's it: we are now debugging the tcp/ip stack!
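If you only want to see who calls into the bind path rather than step through it, gdb can print a backtrace at each hit and carry on by itself. A sketch that generates such a command file for a fresh gdb session attached to an already running guest. The function name and the file name trace.gdb are made up for this illustration, and the list of functions to trace is up to you.

```shell
#!/bin/sh
# Write a gdb command file that attaches to qemu's gdb stub and, for
# each function given, sets a breakpoint that prints a backtrace and
# then resumes the guest.
# Usage: write_trace_script <output-file> <function>...
write_trace_script() {
    out=$1
    shift
    {
        echo "target remote tcp::1234"
        for fn in "$@"; do
            echo "break $fn"
            echo "commands"
            echo "  bt"
            echo "  continue"
            echo "end"
        done
        # Attaching halts the guest, so let it run again at the end.
        echo "continue"
    } > "$out"
}
```

For example: write_trace_script trace.gdb sys_bind inet_bind && gdb -x trace.gdb ./linux-3.2.59/vmlinux, then start server1 in the guest.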