MIT OS Lab 1
Physical address space
Per the lab handout, the layout is as shown in the figure below.
Exercise 1
...
Exercise 2
Exercise 2. Use GDB's si (Step Instruction) command to trace into the ROM BIOS for a few more instructions, and try to guess what it might be doing. You might want to look at Phil Storrs I/O Ports Description, as well as other materials on the 6.828 reference materials page. No need to figure out all the details - just the general idea of what the BIOS is doing first.
(gdb) si
[f000:e05b] 0xfe05b: cmpw $0xffb8,%cs:(%esi)
[f000:e062] 0xfe062: jne 0xd241d124
[f000:e066] 0xfe066: xor %edx,%edx
[f000:e068] 0xfe068: mov %edx,%ss
[f000:e06a] 0xfe06a: mov $0x7000,%sp
[f000:e070] 0xfe070: mov $0x132e,%dx
[f000:e076] 0xfe076: jmp 0x5576cf9e
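My guess: this sequence initializes the ss, sp, and dx registers (ss = 0, sp = 0x7000), and the final jmp hands control to the next stage.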
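The bracketed prefix in each GDB line, such as [f000:e05b], is a real-mode segment:offset pair, and the address printed next to it is the resulting linear address. A minimal sketch of that calculation, assuming the standard real-mode rule (segment shifted left by 4, plus offset); the function name is made up for illustration:

```c
#include <stdint.h>

/* Real-mode linear address: segment * 16 + offset.
 * real_mode_addr is a hypothetical helper, not part of JOS. */
static uint32_t real_mode_addr(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

With seg = 0xf000 and off = 0xe05b this gives 0xfe05b, matching the first line of the trace.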
(gdb)
[f000:cf9c] 0xfcf9c: cli
[f000:cf9d] 0xfcf9d: cld
[f000:cf9e] 0xfcf9e: mov %ax,%cx
[f000:cfa1] 0xfcfa1: mov $0x8f,%ax
[f000:cfa7] 0xfcfa7: out %al,$0x70
[f000:cfa9] 0xfcfa9: in $0x71,%al
[f000:cfab] 0xfcfab: in $0x92,%al
[f000:cfad] 0xfcfad: or $0x2,%al
[f000:cfaf] 0xfcfaf: out %al,$0x92
[f000:cfb1] 0xfcfb1: mov %cx,%ax
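cli disables maskable interrupts and cld clears the direction flag. From what I looked up, port 0x70 is the CMOS/RTC index port, whose top bit also controls NMI (non-maskable interrupt) masking, so writing 0x8f masks NMI and selects CMOS register 0x0f; port 0x71 is the CMOS/RTC data port; and port 0x92 is System Control Port A, where bit 1 gates the A20 line. My first guess was that this is the device initialization (VGA display and so on) mentioned in the lab handout, but the in / or $0x2 / out sequence on port 0x92 looks more like enabling A20.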
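The values written in this trace can be decoded bit by bit. This is only a sketch, assuming the conventional PC port layout (bit 7 of the byte written to 0x70 masks NMI, the low 7 bits select a CMOS register, and bit 1 of port 0x92 is the fast A20 gate); the helper names are invented here:

```c
#include <stdint.h>
#include <stdbool.h>

/* Byte written to port 0x70: bit 7 masks NMI, bits 0-6 pick a CMOS register. */
static bool cmos_nmi_masked(uint8_t v)   { return (v & 0x80) != 0; }
static uint8_t cmos_reg_index(uint8_t v) { return v & 0x7f; }

/* Port 0x92 (System Control Port A): setting bit 1 turns on the A20 line.
 * Mirrors the `or $0x2,%al` in the trace. */
static uint8_t enable_fast_a20(uint8_t port92) { return port92 | 0x02; }
```

For the 0x8f seen in the trace, cmos_nmi_masked returns true and cmos_reg_index returns 0x0f.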
[f000:cfb4] 0xfcfb4: lidtl %cs:(%esi)
[f000:cfba] 0xfcfba: lgdtl %cs:(%esi)
These load the interrupt descriptor table register (lidt) and the global descriptor table register (lgdt).
Exercise 3
1. Take a look at the lab tools guide, especially the section on GDB commands. Even if you're familiar with GDB, this includes some esoteric GDB commands that are useful for OS work.Set a breakpoint at address 0x7c00, which is where the boot sector will be loaded. Continue execution until that breakpoint. Trace through the code in boot/boot.S, using the source code and the disassembly file obj/boot/boot.asm to keep track of where you are. Also use the x/i command in GDB to disassemble sequences of instructions in the boot loader, and compare the original boot loader source code with both the disassembly in obj/boot/boot.asm and GDB.Trace into bootmain() in boot/main.c, and then into readsect(). Identify the exact assembly instructions that correspond to each of the statements in readsect(). Trace through the rest of readsect() and back out into bootmain(), and identify the begin and end of the for loop that reads the remaining sectors of the kernel from the disk. Find out what code will run when the loop is finished, set a breakpoint there, and continue to that breakpoint. Then step through the remainder of the boot loader.
Be able to answer the following questions:
- At what point does the processor start executing 32-bit code? What exactly causes the switch from 16- to 32-bit mode?
(gdb) si
[ 0:7c26] => 0x7c26: or $0x1,%ax
0x00007c26 in ?? ()
(gdb)
[ 0:7c2a] => 0x7c2a: mov %eax,%cr0
0x00007c2a in ?? ()
(gdb)
[ 0:7c2d] => 0x7c2d: ljmp $0xb866,$0x87c32
0x00007c2d in ?? ()
(gdb)
The target architecture is assumed to be i386
=> 0x7c32: mov $0x10,%ax
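After the ljmp that follows mov %eax, %cr0, GDB prints "The target architecture is assumed to be i386", so the processor starts executing 32-bit code at 0x7c32. Protected mode itself is enabled by setting the PE bit in cr0; the ljmp then reloads cs with a 32-bit code segment, which completes the switch.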
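The `or $0x1` just before the write to cr0 only touches bit 0, the PE (Protection Enable) flag. A tiny sketch of that bit manipulation on an ordinary integer, since real cr0 access needs privileged instructions; the macro and function names are placeholders:

```c
#include <stdint.h>

#define CR0_PE 0x00000001u  /* Protection Enable: bit 0 of cr0 */

/* Set PE and leave every other bit alone, like the `or $0x1` in the trace. */
static uint32_t set_pe(uint32_t cr0) { return cr0 | CR0_PE; }

/* Non-zero when the PE bit is set in the given cr0 value. */
static int pe_is_set(uint32_t cr0) { return (cr0 & CR0_PE) != 0; }
```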
- What is the last instruction of the boot loader executed, and what is the first instruction of the kernel it just loaded?
(gdb) c
Continuing.
=> 0x7d61: call *0x10018
Breakpoint 4, 0x00007d61 in ?? ()
The loop ends at 0x7d61, so I set a breakpoint there and continued.
(gdb) si
=> 0x10000c: movw $0x1234,0x472
0x0010000c in ?? ()
(gdb)
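After the call instruction above executes, the code is running at an address just past 0x00100000. According to the physical address space figure, this is extended memory, and the source of bootmain shows it is the address the boot loader loaded the kernel to. So I believe this call is the last instruction of the boot loader.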
- Where is the first instruction of the kernel?
See the second code block of the previous answer: movw $0x1234,0x472 at 0x0010000c. I believe this is the kernel's first instruction.
- How does the boot loader decide how many sectors it must read in order to fetch the entire kernel from disk? Where does it find this information?
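From the source, the start and end pointers of the main loop are:
ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
eph = ph + ELFHDR->e_phnum;
bootmain first reads the beginning of the ELF file into memory at the address given by the ELFHDR macro. It then uses the e_phoff and e_phnum fields of the ELF header to find the offset and the number of program headers, and each Proghdr says where its segment goes and how many bytes to load.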
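The ph/eph walk can be tried on the host with a fake image. This is a rough sketch with trimmed-down stand-in structs (MiniElf and MiniProghdr are hypothetical and keep only the fields used here, so their layout does not match the real struct Elf and struct Proghdr):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the ELF structs. */
struct MiniElf {
    uint32_t e_phoff;   /* byte offset of the program header table */
    uint16_t e_phnum;   /* number of program headers */
};
struct MiniProghdr {
    uint32_t p_offset;  /* where the segment starts in the file */
    uint32_t p_pa;      /* physical load address */
    uint32_t p_memsz;   /* bytes the segment occupies in memory */
};

/* Sum p_memsz over all program headers, visiting them in the same
 * order as bootmain's ph..eph loop. */
static uint32_t total_load_bytes(const uint8_t *image)
{
    struct MiniElf hdr;
    memcpy(&hdr, image, sizeof hdr);
    uint32_t total = 0;
    for (uint16_t i = 0; i < hdr.e_phnum; i++) {
        struct MiniProghdr ph;
        memcpy(&ph, image + hdr.e_phoff + i * sizeof ph, sizeof ph);
        total += ph.p_memsz;
    }
    return total;
}
```

memcpy is used instead of pointer casts so the sketch does not depend on the alignment of the image buffer.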
Exercise4
C pointers and addresses; skipped.
Exercise5
Trace through the first few instructions of the boot loader again and identify the first instruction that would "break" or otherwise do the wrong thing if you were to get the boot loader's link address wrong. Then change the link address in boot/Makefrag to something wrong, run make clean, recompile the lab with make, and trace into the boot loader again to see what happens. Don't forget to change the link address back and make clean again afterward!
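Here I changed the link address in Makefrag to 0x8c00, and the processor got stuck in an infinite loop in bootmain. I do not quite understand what "identify the first instruction that would break" means. I am marking this to revisit once I properly understand linking and loading.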
Exercise6
We can examine memory using GDB's x command. The GDB manual has full details, but for now, it is enough to know that the command x/Nx ADDR prints N words of memory at ADDR. (Note that both 'x's in the command are lowercase.) Warning: The size of a word is not a universal standard. In GNU assembly, a word is two bytes (the 'w' in xorw, which stands for word, means 2 bytes). Reset the machine (exit QEMU/GDB and start them again). Examine the 8 words of memory at 0x00100000 at the point the BIOS enters the boot loader, and then again at the point the boot loader enters the kernel. Why are they different? What is there at the second breakpoint? (You do not really need to use QEMU to answer this question. Just think.)
First:
[ 0:7c00] => 0x7c00: cli
Breakpoint 1, 0x00007c00 in ?? ()
(gdb) x/8x 0x0100000
0x100000: 0x00000000 0x00000000 0x00000000 0x00000000
0x100010: 0x00000000 0x00000000 0x00000000 0x00000000
This is the point where the BIOS enters the boot loader; the memory is all zeros.
=> 0x7d47: cmp %esi,%ebx
Breakpoint 3, 0x00007d47 in ?? ()
(gdb) x/8x 0x100000
0x100000: 0x00000000 0x00000000 0x00000000 0x00000000
0x100010: 0x00000000 0x00000000 0x00000000 0x00000000
This is before the first readseg call in the main loop; the memory is still zero.
(gdb) c
Continuing.
=> 0x7d47: cmp %esi,%ebx
Breakpoint 3, 0x00007d47 in ?? ()
(gdb) x/8x 0x100000
0x100000: 0x1badb002 0x00000000 0xe4524ffe 0x7205c766
0x100010: 0x34000004 0x0000b812 0x220f0011 0xc0200fd8
This is before the second readseg call, and the memory here has changed. Next, look at the ELF file:
[re@restart lab]$ objdump -x obj/kern/kernel
obj/kern/kernel: file format elf32-i386
obj/kern/kernel
architecture: i386, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x0010000c
Program Header:
LOAD off 0x00001000 vaddr 0xf0100000 paddr 0x00100000 align 2**12
filesz 0x0000716c memsz 0x0000716c flags r-x
LOAD off 0x00009000 vaddr 0xf0108000 paddr 0x00108000 align 2**12
filesz 0x0000a948 memsz 0x0000a948 flags rw-
STACK off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**4
filesz 0x00000000 memsz 0x00000000 flags rwx
...
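The first Program Header gives a physical address starting at 0x100000, and the memory there changes once readseg has read in the sectors for that first proghdr. So the difference comes from readseg copying the kernel from disk into that region, and what sits there at the second breakpoint is the beginning of the loaded kernel image.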
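The objdump values also say which disk sectors get read. A small sketch of the arithmetic, assuming 512-byte sectors and that the kernel image starts at sector 1 (sector 0 being the boot sector), which is how I read readseg in boot/main.c; the helper names are my own:

```c
#include <stdint.h>

#define SECTSIZE 512u

/* Disk sector holding byte `offset` of the kernel image
 * (the image starts at sector 1; sector 0 is the boot sector). */
static uint32_t first_sector(uint32_t offset)
{
    return offset / SECTSIZE + 1;
}

/* Whole sectors needed to cover `count` bytes from a sector-aligned start. */
static uint32_t sectors_needed(uint32_t count)
{
    return (count + SECTSIZE - 1) / SECTSIZE;
}
```

For the first LOAD segment (off 0x1000, filesz 0x716c) this gives a start at sector 9 and 57 sectors in total.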
Exercise7
Use QEMU and GDB to trace into the JOS kernel and stop at the movl %eax, %cr0. Examine memory at 0x00100000 and at 0xf0100000. Now, single step over that instruction using the stepi GDB command. Again, examine memory at 0x00100000 and at 0xf0100000. Make sure you understand what just happened.What is the first instruction after the new mapping is established that would fail to work properly if the mapping weren't in place? Comment out the movl %eax, %cr0 in kern/entry.S, trace into it, and see if you were right.
(gdb) x/w 0x00100000
0x100000: 0x1badb002
(gdb) x/w 0xf0100000
0xf0100000 <_start+4026531828>: Cannot access memory at address 0xf0100000
Before movl %eax, %cr0 executes, memory at 0xf0100000 cannot be accessed. After stepping over it, 0xf0100000 and 0x00100000 refer to the same memory.
=> 0x100025: mov $0xf010002c,%eax
0x00100025 in ?? ()
(gdb)
=> 0x10002a: jmp *%eax
0x0010002a in ?? ()
(gdb)
=> 0xf010002c <relocated>: Error while running hook_stop:
Cannot access memory at address 0xf010002c
relocated () at kern/entry.S:74
74 movl $0x0,%ebp # nuke frame pointer
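With that instruction commented out, the processor fails right after the jmp *%eax at 0x10002a: it jumps to 0xf010002c, which cannot be accessed because no virtual-to-physical mapping is in effect.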
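The effect of turning paging on can be modeled as a small translation function. This is a sketch under the assumption that the entry page directory maps only two 4 MB windows, virtual [0, 4MB) and [KERNBASE, KERNBASE+4MB), both onto physical [0, 4MB); entry_translate is a made-up name:

```c
#include <stdint.h>
#include <stdbool.h>

#define KERNBASE 0xF0000000u
#define MAP_SIZE 0x00400000u  /* 4 MB covered by each of the two windows */

/* Translate va under the two assumed entry-time mappings.
 * Returns false when va falls outside both windows. */
static bool entry_translate(uint32_t va, uint32_t *pa)
{
    if (va < MAP_SIZE) {                       /* low identity window */
        *pa = va;
        return true;
    }
    if (va >= KERNBASE && va - KERNBASE < MAP_SIZE) {  /* high kernel window */
        *pa = va - KERNBASE;
        return true;
    }
    return false;
}
```

Both 0x00100000 and 0xf0100000 translate to physical 0x00100000, which is why the two x/w commands show the same word once paging is enabled.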
Exercise8
We have omitted a small fragment of code - the code necessary to print octal numbers using patterns of the form "%o". Find and fill in this code fragment
case 'o':
	// (unsigned) octal: same path as %u and %x, with base 8
	num = getuint(&ap, lflag);
	base = 8;
	goto number;
q1.Explain the interface between printf.c and console.c. Specifically, what function does console.c export? How is this function used by printf.c?
console.c contains the console (CGA display), serial port, and keyboard code.
The putch function in printf.c is built on cputchar, which console.c exports.
q2.Explain the following from console.c:
if (crt_pos >= CRT_SIZE) {
	int i;
	memmove(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
	for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
		crt_buf[i] = 0x0700 | ' ';
	crt_pos -= CRT_COLS;
}
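When the display buffer is full (crt_pos >= CRT_SIZE), memmove copies everything from offset CRT_COLS to the end of the buffer back to the start of the buffer, which scrolls the screen up by one row. The last row is then filled with blanks, and crt_pos is moved back by one row so that it points at the free space.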
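The same scroll logic can be exercised on a tiny buffer. A sketch with a 4x3 "screen" standing in for the real 80x25 one; the COLS/ROWS/SIZE constants and the function name are chosen here for illustration only:

```c
#include <stdint.h>
#include <string.h>

enum { COLS = 4, ROWS = 3, SIZE = COLS * ROWS };  /* tiny stand-in screen */

/* Scroll up by one row when pos has run off the end of the buffer.
 * Returns the (possibly adjusted) cursor position. */
static int scroll_if_full(uint16_t *buf, int pos)
{
    if (pos >= SIZE) {
        /* move rows 1..ROWS-1 up to rows 0..ROWS-2 */
        memmove(buf, buf + COLS, (SIZE - COLS) * sizeof(uint16_t));
        /* blank the last row: attribute 0x07, character ' ' */
        for (int i = SIZE - COLS; i < SIZE; i++)
            buf[i] = 0x0700 | ' ';
        pos -= COLS;
    }
    return pos;
}
```

memmove rather than memcpy matters here because the source and destination ranges overlap.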
q3.
For the following questions you might wish to consult the notes for Lecture 2. These notes cover GCC's calling convention on the x86.Trace the execution of the following code step-by-step:
int x = 1, y = 3, z = 4;
cprintf("x %d, y %x, z %d\n", x, y, z);
- In the call to cprintf(), to what does fmt point? To what does ap point?
fmt points to the format string "x %d, y %x, z %d\n"; ap points to the variable arguments x, y, z.
- List (in order of execution) each call to cons_putc, va_arg, and vcprintf. For cons_putc, list its argument as well. For va_arg, list what ap points to before and after the call. For vcprintf list the values of its two arguments.
The implementation works like a finite state machine:
vcprintf(fmt, x,y,z), cons_putc('x'), cons_putc(' '), va_arg(x,y,z, int), ...
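How va_arg consumes one argument per call can be seen in a standalone variadic function. A minimal sketch using the standard <stdarg.h> macros on the host (not the JOS versions); sum_ints is an invented example:

```c
#include <stdarg.h>

/* Sum `n` int arguments. Each va_arg call yields the current argument
 * and advances ap to the next one, the way vprintfmt consumes x, y, z. */
static int sum_ints(int n, ...)
{
    va_list ap;
    va_start(ap, n);          /* ap now refers to the first variadic argument */
    int total = 0;
    for (int i = 0; i < n; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

Calling sum_ints(3, 1, 3, 4) walks ap across the same three values as the cprintf call in the question.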
q4. Run the following code.
unsigned int i = 0x00646c72;
cprintf("H%x Wo%s", 57616, &i);
What is the output? Explain how this output is arrived at in the step-by-step manner of the previous exercise. Here's an ASCII table that maps bytes to characters.
Welcome to the JOS kernel monitor!
Type 'help' for a list of commands.
He110 WorldK>
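The output follows from two facts: 57616 is 0xe110 in hexadecimal, and on a little-endian machine the bytes of 0x00646c72 in memory are 0x72 'r', 0x6c 'l', 0x64 'd', 0x00, which %s reads as the string "rld". A sketch that reproduces this with the host's snprintf instead of cprintf, assuming a little-endian host; hello is a made-up wrapper:

```c
#include <stdio.h>
#include <string.h>

/* Format the same arguments as the lab snippet into `out`. */
static void hello(char *out, size_t n)
{
    unsigned int i = 0x00646c72;
    char bytes[sizeof i];
    /* Copy the in-memory bytes of i; little-endian order is 72 6c 64 00. */
    memcpy(bytes, &i, sizeof i);
    snprintf(out, n, "H%x Wo%s", 57616, bytes);
}
```

On a big-endian machine the first byte would be 0x00, so %s would print an empty string.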
q5.In the following code, what is going to be printed after 'y='? (note: the answer is not a specific value.) Why does this happen?
cprintf("x=%d y=%d", 3);
My guess is that after va_arg runs, ap points to the memory just past the 3. Let me verify that.
(gdb) p ap
$11 = (va_list) 0xf010ff74
(gdb) c
Continuing.
=> 0xf0100db1 <vprintfmt+35>: mov %esi,%ebx
Breakpoint 3, vprintfmt (putch=0xf0100901 <putch>,
putdat=0xf010ff3c, fmt=0xf0101c0e, ap=0xf010ff78)
at lib/printfmt.c:92
92 while ((ch = *(unsigned char *) fmt++) != '%') {
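The guess was right: after consuming the 3, ap points to the slot after it, which was never passed as an argument, so whatever happens to be there gets printed.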
q6.Let's say that GCC changed its calling convention so that it pushed arguments on the stack in declaration order, so that the last argument is pushed last. How would you have to change cprintf or its interface so that it would still be possible to pass it a variable number of arguments?
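?? I can't quite figure this one out. Presumably only the va_list interface needs to change: va_start and va_arg would have to step through addresses in the opposite direction from what they do now.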
Exercise 9. Determine where the kernel initializes its stack, and exactly where in memory its stack is located. How does the kernel reserve space for its stack? And at which "end" of this reserved area is the stack pointer initialized to point to?
=> 0xf010002f <relocated>: mov $0x0,%ebp
relocated () at kern/entry.S:74
74 movl $0x0,%ebp # nuke frame pointer
(gdb)
=> 0xf0100034 <relocated+5>: mov $0xf0110000,%esp
relocated () at kern/entry.S:77
77 movl $(bootstacktop),%esp
(gdb)
=> 0xf0100039 <relocated+10>: call 0xf010009d <i386_init>
80 call i386_init
(gdb)
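The kernel initializes its stack in entry.S, around virtual address 0xf010002f. esp is set to bootstacktop = 0xf0110000, which is the base of the stack (the high end of the reserved area); the stack grows down from there.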
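Given the initial esp, the low end of the reserved area follows from the stack size. A sketch assuming KSTKSIZE is 8 pages of 4096 bytes, which is what I remember from inc/memlayout.h and should be checked against the source tree:

```c
#include <stdint.h>

#define PGSIZE   4096u
#define KSTKSIZE (8 * PGSIZE)   /* assumption: 32 KB kernel stack */

/* Lowest address of the reserved stack area, given the initial esp. */
static uint32_t stack_bottom(uint32_t stacktop)
{
    return stacktop - KSTKSIZE;
}
```

With bootstacktop = 0xf0110000 this gives 0xf0108000, which lines up with the vaddr of the second LOAD segment in the objdump output from Exercise 6.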
Exercise.10
To become familiar with the C calling conventions on the x86, find the address of the test_backtrace function in obj/kern/kernel.asm, set a breakpoint there, and examine what happens each time it gets called after the kernel starts. How many 32-bit words does each recursive nesting level of test_backtrace push on the stack, and what are those words?
Breakpoint 1, test_backtrace (x=3) at kern/init.c:13
13 {
(gdb) bt
#0 test_backtrace (x=3) at kern/init.c:13
#1 0xf0100069 in test_backtrace (x=4) at kern/init.c:16
#2 0xf0100069 in test_backtrace (x=5) at kern/init.c:16
#3 0xf01000ea in i386_init () at kern/init.c:39
#4 0xf010003e in relocated () at kern/entry.S:80
(gdb) x/w $ebp
0xf010ffb8: 0xf010ffd8
(gdb) x/w 0xf010ffd8
0xf010ffd8: 0xf010fff8
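From the above, 0xf010ffd8 - 0xf010ffb8 = 0x20, so I infer that each test_backtrace stack frame is 32 bytes (eight 32-bit words). Reading kernel.asm: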
f0100040: 55 push %ebp
f0100041: 89 e5 mov %esp,%ebp
f0100043: 53 push %ebx
f0100044: 83 ec 14 sub $0x14,%esp
f0100047: 8b 5d 08 mov 0x8(%ebp),%ebx
cprintf("entering test_backtrace %d\n", x);
f010004a: 89 5c 24 04 mov %ebx,0x4(%esp)
f010004e: c7 04 24 e0 18 10 f0 movl $0xf01018e0,(%esp)
f0100055: e8 ed 08 00 00 call f0100947 <cprintf>
if (x > 0)
f010005a: 85 db test %ebx,%ebx
f010005c: 7e 0d jle f010006b <test_backtrace+0x2b>
test_backtrace(x-1);
f010005e: 8d 43 ff lea -0x1(%ebx),%eax
f0100061: 89 04 24 mov %eax,(%esp)
f0100064: e8 d7 ff ff ff call f0100040 <test_backtrace>
f0100069: eb 1c jmp f0100087 <test_backtrace+0x47>
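Those 32 bytes are: the saved ebp and ebx (8 bytes), the 20 bytes reserved on the stack by sub $0x14,%esp, and the 4-byte return eip pushed by call.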
Exercise 11.
Implement the backtrace function as specified above. Use the same format as in the example, since otherwise the grading script will be confused. When you think you have it working right, run make grade to see if its output conforms to what our grading script expects, and fix it if it doesn't. After you have handed in your Lab 1 code, you are welcome to change the output format of the backtrace function any way you like.
uint32_t ebp = read_ebp();
uint32_t *ep = (uint32_t*)ebp;
while(*ep != 0)
{
	uint32_t *epn = (uint32_t*)(*ep);	// caller's frame pointer
	cprintf("ebp %x eip %x args ", ep, *(ep+1));
	// print every word between the return address and the caller's frame
	for(ep = ep + 2; ep < epn; ep++)
	{
		cprintf(" %08x", *ep);
	}
	cprintf("\n");
}
// outermost frame (saved ebp == 0): print five zero args
cprintf("ebp %x eip %x args ", ep, *(ep + 1));
for(int i = 0; i < 5;i++)
	cprintf("%08x ", 0);
return 0;
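I fell into two traps on this one and was stuck for a long time. At first I printed *ep as ebp, which skipped the first frame's ebp entirely; it took a lot of checking before I figured that out.
The other is that the grading for this exercise is really odd, mainly those inexplicable five args. When I first read the source I saw that the call to i386_init takes no arguments, so I thought I was being clever and printed only ebp and eip for the last frame. It kept failing, and only later did I realize why.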
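The frame-pointer chain itself can be tried out on a fake stack held in an array. This is an illustrative sketch, not my submitted code: frame pointers are array indices instead of addresses, index 0 plays the role of the terminating ebp == 0, and walk_frames is an invented name:

```c
#include <stdint.h>
#include <stddef.h>

/* Walk a chain of saved frame pointers inside a fake stack.
 * stack[fp] holds the caller's frame index (0 ends the chain) and
 * stack[fp + 1] holds the return address. Stores up to `max` return
 * addresses in eips and returns the number of frames visited. */
static size_t walk_frames(const uint32_t *stack, uint32_t fp,
                          uint32_t *eips, size_t max)
{
    size_t n = 0;
    while (fp != 0 && n < max) {
        eips[n++] = stack[fp + 1];   /* return address sits above saved ebp */
        fp = stack[fp];              /* follow the link to the caller */
    }
    return n;
}
```

Looping on the frame pointer itself, rather than on the saved value it points to, handles the outermost frame in the same loop as the others.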
Exercise12
Modify your stack backtrace function to display, for each eip, the function name, source file name, and line number corresponding to that eip.
int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
	uint32_t ebp = read_ebp();
	uint32_t *ep = (uint32_t*)ebp;
	struct Eipdebuginfo info;
	while(*ep != 0)
	{
		uint32_t *epn = (uint32_t*)(*ep);	// caller's frame pointer
		ebp = *(ep + 1);	// note: ebp is reused here to hold the return eip
		cprintf("ebp %x eip %x args ", ep, *(ep+1));
		debuginfo_eip(*(ep + 1), &info);
		for(ep = ep + 2; ep < epn; ep++)
		{
			cprintf(" %08x", *ep);
		}
		cprintf("\n%s:%d: ", info.eip_file, info.eip_line);
		// eip_fn_name is not NUL-terminated, so print it with an explicit length
		cprintf("%.*s+%d\n", info.eip_fn_namelen, info.eip_fn_name, ebp - info.eip_fn_addr);
	}
	// outermost frame (saved ebp == 0)
	cprintf("ebp %x eip %x args ", ep, *(ep + 1));
	for(int i = 0; i < 5;i++)
		cprintf("%08x ", 0);
	debuginfo_eip(*(ep + 1), &info);
	cprintf("\n%s:%d: ", info.eip_file, info.eip_line);
	cprintf("%.*s+%d\n", info.eip_fn_namelen, info.eip_fn_name, *(ep + 1) - info.eip_fn_addr);
	return 0;
}
Complete the implementation of debuginfo_eip by inserting the call to stab_binsearch to find the line number for an address.
...
stab_binsearch(stabs, &lline, &rline, N_SLINE, addr);
if(lline <= rline)
{
	info->eip_line = stabs[rline].n_desc;
}
else
{
	return -1;
}
...
Add a backtrace command to the kernel monitor
Welcome to the JOS kernel monitor!
Type 'help' for a list of commands.
K> backtrace
ebp f010ff68 eip f01009fa args 00000001 f010ff80 00000000 00010094 f0112540 00000000 f010ffb8 f0100a6c f0100a2b f010ffac f0101ab7 f010ffe8 f0101a9c f010ffc4 00000000 00000000 00000000 00000005 f010ffd8 f0100a89 f0101ab7 f010ffe4 00000000 00010094 00010094 00000000
kern/monitor.c:145: monitor+258
ebp f010ffd8 eip f010010a args 00000000 00001aac 00000640 00000000 00000000 00000000
kern/init.c:42: i386_init+109
ebp f010fff8 eip f010003e args 00000000 00000000 00000000 00000000 00000000
kern/entry.S:83: <unknown>+0
Results
[re@restart lab]$ make grade
make clean
make[1]: Entering directory `/home/re/6.828/lab'
rm -rf obj .gdbinit jos.in qemu.log
make[1]: Leaving directory `/home/re/6.828/lab'
./grade-lab1
make[1]: Entering directory `/home/re/6.828/lab'
make[1]: Leaving directory `/home/re/6.828/lab'
make[1]: Entering directory `/home/re/6.828/lab'
+ as kern/entry.S
+ cc kern/entrypgdir.c
+ cc kern/init.c
+ cc kern/console.c
+ cc kern/monitor.c
+ cc kern/printf.c
+ cc kern/kdebug.c
+ cc lib/printfmt.c
+ cc lib/readline.c
+ cc lib/string.c
+ ld obj/kern/kernel
ld: warning: section `.bss' type changed to PROGBITS
+ as boot/boot.S
+ cc -Os boot/main.c
+ ld boot/boot
boot block is 380 bytes (max 510)
+ mk obj/kern/kernel.img
make[1]: Leaving directory `/home/re/6.828/lab'
running JOS: (5.3s)
printf: OK
backtrace count: OK
backtrace arguments: OK
backtrace symbols: OK
backtrace lines: OK
Score: 50/50
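It took me several days to finish Lab 1; getting everything done before the semester starts is going to be tough.
Wrap-up
I really know nothing about linking and loading. I hope to get a chance to finish reading the book 程序员自我修养 before the end of March.
Having finished all of lab1, the most troublesome part was setting up the environment: I spent nearly a day on qemu under CentOS. Beyond that, my understanding of function stack frames was poor, and after this lab I finally understand a lot more.
Also, a small goal for myself: finish lab2 within two days.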