Understanding the Linux Kernel Call Stack and Stack Frame Structure in Depth

2021-07-20 06:31:55

Excerpted from: https://blog.csdn.net/koozxcv/article/details/49998237

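Stack overflow is usually caused by recursion that goes too deep. But why does deep recursion overflow the stack? Before answering that, let's review the basic stack concepts involved in function calls: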

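1. Every thread has its own call stack, which holds that thread's function call information.

2. Every function in the program that has not yet returned has a stack frame, also known by the grander name "activation record". A stack frame holds the function's local variables, the arguments passed to the functions it calls, and similar information.

3. On x86, the bottom of the stack is at a high address and the top is at a low address; the stack grows from high memory addresses toward low ones.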
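Point 3 can be checked with a small experiment. The sketch below is not from the original article: the names record_depth and stack_grows_down are made up for illustration, and it assumes GCC or Clang (for the noinline attribute) on a platform such as x86 where the stack grows downward. It records the address of one local variable per recursion level and checks that each deeper frame sits at a lower address.

```c
#include <stdint.h>

/* Record the address of a local variable at each recursion depth.
 * noinline (a GCC/Clang attribute) keeps every call as a real frame. */
__attribute__((noinline))
static void record_depth(uintptr_t *addrs, int depth, int max_depth)
{
    volatile char marker = 0;           /* lives in this call's stack frame */
    addrs[depth] = (uintptr_t)&marker;
    if (depth + 1 < max_depth)
        record_depth(addrs, depth + 1, max_depth);
    marker = 1;                         /* work after the call prevents a tail call */
}

/* Returns 1 if every deeper frame is at a lower address, 0 otherwise. */
int stack_grows_down(void)
{
    uintptr_t addrs[4];
    record_depth(addrs, 0, 4);
    for (int i = 1; i < 4; i++)
        if (addrs[i] >= addrs[i - 1])
            return 0;
    return 1;
}
```

Calling stack_grows_down() from main and printing the result should give 1 on the x86 environment described in this article; other architectures or instrumented builds may behave differently.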

Consider the following code:

#include <stdio.h>

void f(int a) {
    printf("%d\n", a);
}
void g() {
    int a = 2;
    f(a);
}

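When execution enters printf, the call stack (shown schematically) holds, from the stack bottom at high addresses to the stack top at low addresses: the frame of g, then the frame of f, then the frame of printf.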

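As you can see, each ordinary function call moves the stack frame pointer downward, and every process's call stack has a size limit. When the calls nest so deeply that the stack frame pointer crosses the lower bound of the call stack, the stack overflows. So when you write recursive code, take care to keep the recursion depth bounded, and do not define very large arrays on the stack.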
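The size limit mentioned above can be queried at run time. The following is a minimal sketch, not part of the original article, that uses the POSIX getrlimit call with RLIMIT_STACK; the helper names stack_soft_limit and print_stack_limit are hypothetical.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Returns the soft stack size limit in bytes, or 0 if the query fails.
 * RLIM_INFINITY is passed through when the limit is "unlimited". */
rlim_t stack_soft_limit(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return 0;
    return rl.rlim_cur;
}

/* Prints the limit in KiB, or "unlimited". */
void print_stack_limit(void)
{
    rlim_t lim = stack_soft_limit();
    if (lim == RLIM_INFINITY)
        printf("stack limit: unlimited\n");
    else
        printf("stack limit: %llu KiB\n", (unsigned long long)(lim / 1024));
}
```

On many Linux systems the default soft limit is 8 MiB, which would be consistent with the size of the [stack] mapping in the gdb output of section 1, but the value depends on the system configuration.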

1. Lower-bound overflow

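I'll call this a lower-bound overflow, because the stack frame pointer reaches the lower bound of the stack's address space. This is the kind of overflow analyzed above; here is a concrete program that triggers it: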

 1 #include <stdio.h>
 2 #include <string.h>
 3 
 4 void call()
 5 {
 6     int a[1024];
 7     printf("hello call! \n");
 8     call();
 9 }
10 int main(int argc, char *argv[]) {
11     call();
12 }

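Compile this code. You can set breakpoints in gdb and watch the stack frame pointer change from one call to the next; only the final result is shown here: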

gdb ./a.out
(gdb) r

Program received signal SIGSEGV, Segmentation fault.

(gdb) bt
#0    call () at overflow.c:7
#1    0x08048426 in call () at overflow.c:8
#2    0x08048426 in call () at overflow.c:8
...
#36   0x08048426 in call () at overflow.c:8
#2032 0x08048426 in call () at overflow.c:8

(gdb) info frame 
Stack level 0, frame at 0xbf800520:
 eip = 0x8048415 in call (overflow.c:7); saved eip 0x8048426
 called by frame at 0xbf801540
 source language c.
 Arglist at 0xbf800518, args: 
 Locals at 0xbf800518, Previous frame's sp is 0xbf800520
 Saved registers:
  ebp at 0xbf800518, eip at 0xbf80051c

(gdb) info proc mappings 
process 5560
Mapped address spaces:

    Start Addr   End Addr       Size     Offset objfile
    0x08048000  0x08049000     0x1000        0x0 a.out
    0x08049000  0x0804a000     0x1000        0x0 a.out
    0x0804a000  0x0804b000     0x1000     0x1000 a.out
    ...
    0xb7fff000  0xb8000000     0x1000    0x20000 /lib/i386-linux-gnu/ld-2.15.so
    0xbf801000  0xc0000000   0x7ff000        0x0 [stack]

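a. bt prints the call stack. The backtrace contains 2033 frames (#0 through #2032), so the stack overflowed after 2033 calls.

b. info proc mappings shows that the process stack occupies the address range 0xbf801000 to 0xc0000000.

c. info frame describes the current frame. The frame of the last call is at 0xbf800520 and its caller's frame is at 0xbf801540, so the newest frame already lies below 0xbf801000, the lowest mapped stack address. The remaining stack space could not hold another frame, the program touched an unmapped address, and a segmentation fault followed.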
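The numbers in the gdb output can be cross-checked with a little arithmetic. The sketch below is an illustration added here, with hypothetical helper names frame_stride and frames_that_fit; it uses only the addresses printed above.

```c
#include <stdint.h>

/* Bytes between two adjacent frames, from gdb's "frame at" values. */
uint32_t frame_stride(uint32_t caller_frame, uint32_t callee_frame)
{
    return caller_frame - callee_frame;
}

/* Number of whole frames of the given size that fit in a stack region. */
uint32_t frames_that_fit(uint32_t stack_size, uint32_t stride)
{
    return stack_size / stride;
}
```

With the values above, frame_stride(0xbf801540, 0xbf800520) is 0x1020 = 4128 bytes: 4096 bytes for int a[1024] plus 32 bytes of per-call overhead. frames_that_fit(0x7ff000, 4128) gives 2031, which is in the same range as the roughly two thousand frames in the backtrace.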

2. Upper-bound overflow

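Lower-bound overflow is common and easy to produce: recurse deeply enough, or define a large enough non-static array inside a function. How can we produce an upper-bound overflow? We still need a sufficiently deep chain of "calls", except that each "call" must move the stack pointer toward the stack bottom, whereas ordinary function calls grow the stack downward. What can we do? Think about what a function call involves: the arguments and the return address are pushed onto the stack, and the return address is popped when the function returns.

The trick is hidden right there. We simulate a function call by overwriting the function's return address so that it points at the function's own entry point. No real call takes place, so no return address is pushed; but when the function returns, it "believes" it was called in the normal way and pops a return address off the stack anyway. This sidesteps the rules: every "self-call" raises the stack pointer by 4 bytes (on a 32-bit system). In theory, then, we can make the stack pointer climb upward. Talk is cheap, so here is the code: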

 1 #include <stdio.h>
 2 #include <string.h>
 3 
 4 #define ADDRESS(a, f) *(int*)a = ((int)f);
 5 char t[] = { 0, 0, 0, 0, 0 };
 6 
 7 void call()
 8 {
 9     char c;
10     char *p = &c;
11     printf("hello call! \n");
 12     strcpy(p + 17, t); /* overwrite the return address */
13 }
14 int main() {
15     int a = 2;
 16     ADDRESS(t, call); /* store the address of call in t */
17     call();
18 }

You may well ask how I knew where the return address is stored, and where the +17 comes from. The answer is simple: disassemble the function.

0804843c <call>:
 804843c:    55                       push   %ebp
 804843d:    89 e5                    mov    %esp,%ebp
 804843f:    83 ec 28                 sub    $0x28,%esp
 8048442:    8d 45 f3                 lea    -0xd(%ebp),%eax /* address of c: as you can see, it is ebp - 13 */
 8048445:    89 45 f4                 mov    %eax,-0xc(%ebp)
 8048448:    c7 04 24 28 85 04 08     movl   $0x8048528,(%esp)
 804844f:    e8 cc fe ff ff           call   8048320 <puts@plt>
 8048454:    8b 45 f4                 mov    -0xc(%ebp),%eax
 8048457:    83 c0 11                 add    $0x11,%eax
 804845a:    c7 44 24 04 1c a0 04     movl   $0x804a01c,0x4(%esp)
 8048461:    08 
 8048462:    89 04 24                 mov    %eax,(%esp)
 8048465:    e8 a6 fe ff ff           call   8048310 <strcpy@plt>
 804846a:    c9                       leave  
 804846b:    c3                       ret

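From the disassembly we can work out the layout of the current stack frame:

    ebp + 4       return address
    ebp           saved ebp
    ebp - 0xc     p (4 bytes)
    ebp - 0xd     c (1 byte)
    ...
    esp + 4       second outgoing argument (address of t for strcpy)
    esp           first outgoing argument (esp = ebp - 0x28)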

Hence the return address is stored at ebp + 4 = (ebp - 13) + 17 = &c + 17.
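The same offset calculation can be written down as a tiny helper. This is only a worked example of the arithmetic above, with hypothetical names; it assumes 32-bit x86 code built with a frame pointer, where the saved ebp is at ebp + 0 and the return address at ebp + 4. A different compiler version or different flags may place c elsewhere, so the disassembly has to be checked each time.

```c
/* On 32-bit x86 with a frame pointer, the return address slot is at ebp + 4. */
enum { RET_ADDR_OFFSET = 4 };

/* Distance in bytes from a local at ebp + local_offset up to the return address slot. */
int distance_to_return_address(int local_offset)
{
    return RET_ADDR_OFFSET - local_offset;
}
```

For c at ebp - 13, distance_to_return_address(-13) evaluates to 17, the constant used in strcpy(p + 17, t).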

Here is the result of running it:

Program received signal SIGSEGV, Segmentation fault.
0xb7ea6ea7 in ?? () from /lib/i386-linux-gnu/libc.so.6
(gdb) bt
#0  0xb7ea6ea7 in ?? () from /lib/i386-linux-gnu/libc.so.6
#1  0x0804846a in call () at overflow.c:12
#2  0x0804843c in frame_dummy ()
Cannot access memory at address 0xc0000000

(gdb) info frame 
Stack level 0, frame at 0xbfffffd0:
 eip = 0xb7ea6ea7; saved eip 0x804846a
 called by frame at 0xc0000000
 Arglist at 0xbffffff8, args: 
 Locals at 0xbffffff8, Previous frame's sp is 0xbfffffd0
 Saved registers:
  eip at 0xbfffffcc

(gdb) info proc mappings 
process 6332
Mapped address spaces:

    Start Addr   End Addr       Size     Offset objfile
    0x08048000 0x08049000     0x1000        0x0 a.out
    0x08049000 0x0804a000     0x1000        0x0 a.out
    0x0804a000 0x0804b000     0x1000     0x1000 a.out
    ...
    0xb7fff000 0xb8000000     0x1000    0x20000 /lib/i386-linux-gnu/ld-2.15.so
    0xbffdf000 0xc0000000    0x21000        0x0 [stack]

In the end the stack frame pointer reached 0xc0000000, overflowing the upper bound of the stack, which matches the analysis above.
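The upward creep of the stack pointer can also be modeled without touching real memory. The sketch below is a simplified simulation added for illustration, not the article's program: the function name passes_until_top is hypothetical, and it only replays the effect on esp of the prologue and epilogue instructions from the disassembly of call() (push %ebp, mov, sub $0x28, leave, ret).

```c
#include <stdint.h>

/* Counts how many passes through call() it takes for the stack pointer at
 * function entry to reach stack_top, given that each ret pops a return
 * address that no call instruction ever pushed. */
uint32_t passes_until_top(uint32_t entry_esp, uint32_t stack_top)
{
    uint32_t passes = 0;
    while (entry_esp < stack_top) {
        uint32_t esp = entry_esp;
        esp -= 4;               /* push %ebp */
        uint32_t ebp = esp;     /* mov %esp,%ebp */
        esp -= 0x28;            /* sub $0x28,%esp */
        esp = ebp;              /* leave, part 1: mov %ebp,%esp */
        esp += 4;               /* leave, part 2: pop %ebp */
        esp += 4;               /* ret pops the overwritten return address */
        entry_esp = esp;        /* call() is re-entered 4 bytes higher */
        passes++;
    }
    return passes;
}
```

Starting from an arbitrary, made-up entry value of 0xbfffff00 with the stack top at 0xc0000000, the model needs 64 passes, one per 4-byte step. The real program faults once an access lands at or beyond 0xc0000000, as the gdb session shows.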

Test environment:

CPU instruction set	x86
Operating system	Ubuntu 12.04
Kernel version	Linux 3.2.0
gcc version	gcc-4.7
