MIT 6.828 Lab 1: Exercises 4~6

Exercise 4 is fairly simple, so I skipped it.

Exercise 5

Background from the MIT 6.828 Lab 1 page:
(Explanation from 6.828) ELF binary: When you compile and link a C program such as the JOS kernel, the compiler transforms each C source ('.c') file into an object ('.o') file containing assembly language instructions encoded in the binary format expected by the hardware. The linker then combines all of the compiled object files into a single binary image such as obj/kern/kernel, which in this case is a binary in the ELF format ("Executable and Linkable Format").

An ELF binary starts with a fixed-length ELF header, followed by a variable-length program header listing each of the program sections to be loaded, which include:
.text: The program’s executable instructions.
.rodata: Read-only data, such as ASCII string constants produced by the C compiler. (We will not bother setting up the hardware to prohibit writing, however.)
.data: The data section holds the program’s initialized data, such as global variables declared with initializers like int x = 5;

When the linker computes the memory layout of a program, it reserves space for uninitialized global variables, such as int x;, in a section called .bss that immediately follows .data in memory. C requires that “uninitialized” global variables start with a value of zero. Thus there is no need to store contents for .bss in the ELF binary; instead, the linker records just the address and size of the .bss section. The loader or the program itself must arrange to zero the .bss section.

VMA (virtual address / link address):
The link address of a section is the memory address from which the section expects to execute. The linker encodes the link address in the binary in various ways, such as when the code needs the address of a global variable, with the result that a binary usually won’t work if it is executing from an address that it is not linked for.

LMA (physical address / load address):
The load address of a section is the memory address at which that section should be loaded into memory.

Typically, the link and load addresses are the same, as they are for the boot loader.
For the kernel, however, these two addresses differ: the kernel is telling the boot loader to load it into memory at a low address (1 megabyte), but it expects to execute from a high address.
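As a small worked example of that difference, the kernel's first LOAD segment in the objdump -x output further down has vaddr 0xf0100000 and paddr 0x00100000. The sketch below only does that arithmetic; vma_to_lma is a hypothetical helper name, and it assumes the whole kernel is shifted by one constant offset.

```python
# Link (VMA) and load (LMA) address of the kernel's first LOAD segment,
# copied from the objdump -x output of obj/kern/kernel in this post.
KERNEL_VMA = 0xF0100000
KERNEL_LMA = 0x00100000

def vma_to_lma(vma, link_base=KERNEL_VMA, load_base=KERNEL_LMA):
    """Translate a link address into the address it is loaded at.

    Hypothetical helper: assumes one constant offset for the whole kernel.
    """
    return vma - link_base + load_base

print(hex(KERNEL_VMA - KERNEL_LMA))  # offset between the two views
print(hex(vma_to_lma(0xF010000C)))   # an address 12 bytes into the segment
```

With these numbers the offset is 0xf0000000, and link address 0xf010000c corresponds to load address 0x10000c.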

The boot loader uses the ELF program headers to decide how to load the sections. The program headers specify which parts of the ELF object to load into memory and the destination address each should occupy.
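To make the idea of "load each segment to its destination address" concrete, here is a minimal sketch. It is not the JOS boot loader's code: load_segments and the (offset, paddr, filesz, memsz) tuples are simplified, hypothetical stand-ins for the real ELF program header fields, and memory is just a bytearray.

```python
def load_segments(image, segments, memory):
    """Copy each loadable segment from the ELF image into memory.

    segments is a list of (offset, paddr, filesz, memsz) tuples, a
    simplified stand-in for real ELF program headers.
    """
    for offset, paddr, filesz, memsz in segments:
        # The first filesz bytes come from the file image.
        memory[paddr:paddr + filesz] = image[offset:offset + filesz]
        # Anything beyond filesz (for example .bss) has no file contents
        # and is zero-filled here.
        memory[paddr + filesz:paddr + memsz] = bytes(memsz - filesz)
    return memory

image = bytes(range(16))          # fake 16-byte "file"
memory = bytearray(b"\xff" * 32)  # fake memory, pre-filled with 0xff
load_segments(image, [(4, 8, 4, 8)], memory)
print(memory[8:16].hex())
```

The single fake segment copies file bytes 4..7 to address 8 and zero-fills the remaining four bytes of its memsz.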

The BIOS loads the boot sector into memory starting at address 0x7c00, so this is the boot sector’s load address. This is also where the boot sector executes from, so this is also its link address. We set the link address by passing -Ttext 0x7C00 to the linker in boot/Makefrag, so the linker will produce the correct memory addresses in the generated code.

You can use the objdump -x command to see all of the headers. The entries that begin with LOAD are the ones that need to be loaded into memory.

Exercise 5 task:
Trace through the first few instructions of the boot loader again and identify the first instruction that would “break” or otherwise do the wrong thing if you were to get the boot loader’s link address wrong.
Then change the link address in boot/Makefrag to something wrong, run make clean, recompile the lab with make, and trace into the boot loader again to see what happens.
Don’t forget to change the link address back and make clean again afterward!

Following the hint in the task, the BIOS loads the boot sector into memory starting at 0x7c00. Opening boot/Makefrag, we find the following rule:

$(OBJDIR)/boot/boot: $(BOOT_OBJS)
	@echo + ld boot/boot
	$(V)$(LD) $(LDFLAGS) -N -e start -Ttext 0x7C00 -o $@.out $^
	$(V)$(OBJDUMP) -S $@.out >$@.asm
	$(V)$(OBJCOPY) -S -O binary -j .text $@.out $@
	$(V)perl boot/sign.pl $(OBJDIR)/boot/boot

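First make the change: replace 0x7C00 with, say, 0x8900. Then run make clean, start make qemu-gdb again, and run make gdb. Set a breakpoint at 0x7c00, because the BIOS still loads the boot sector at 0x7c00 by default, and single-step with stepi. Execution ends up stuck at the instruction ljmp $PROT_MODE_CSEG, $protcseg, as shown below: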

(gdb) 
[   0:7c2a] => 0x7c2a:	mov    %eax,%cr0
0x00007c2a in ?? ()
(gdb) 
[   0:7c2d] => 0x7c2d:	ljmp   $0x8,$0x8932
0x00007c2d in ?? ()
(gdb) 
[   0:7c2d] => 0x7c2d:	ljmp   $0x8,$0x8932
0x00007c2d in ?? ()
(gdb) 
[   0:7c2d] => 0x7c2d:	ljmp   $0x8,$0x8932
0x00007c2d in ?? ()
(gdb) 
[   0:7c2d] => 0x7c2d:	ljmp   $0x8,$0x8932
0x00007c2d in ?? ()
(gdb) 
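The ljmp target in this trace can be checked with a little arithmetic. The sketch below uses only numbers that appear in this post (the load address 0x7c00, the wrong link address 0x8900, and the target 0x8932 printed by GDB); the variable names are my own.

```python
LOAD_ADDR = 0x7C00    # where the BIOS actually places the boot sector
WRONG_LINK = 0x8900   # the deliberately wrong -Ttext value
LJMP_TARGET = 0x8932  # target encoded in the ljmp shown by GDB

# Offset of the jump target from the start of the boot sector.
offset = LJMP_TARGET - WRONG_LINK
# Where that code really sits in memory after the BIOS loads it.
real_target = LOAD_ADDR + offset

print(hex(offset), hex(real_target))
```

So the linker computed the target as "wrong base + 0x32", while the code that should run next really lives at 0x7c32.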

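It turns out the problem actually starts with an earlier instruction:
0x7c1e: lgdtw -0x769c
Compare it with the same instruction before the link address was changed:
0x7c1e: lgdtw 0x7c64 // corresponds to lgdt gdtdesc in boot.S
This instruction loads the 6 bytes of data at 0x7c64 into GDTR, which is essential for the switch from real mode to protected mode. A reference for the lgdt instruction:
https://www.fermimn.edu.it/linux/quarta/x86/lgdt.htm
Let's print the value stored there.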

(gdb) x/6b 0x7c64
0x7c64:	0x17	0x00	0x4c	0x89	0x00	0x00

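This is also what makes the target address of the later ljmp wrong. I have not yet worked out the exact internal details of why, so I will leave it as an open question here.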
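One way to approach this question is to decode the numbers already printed above. This is a sketch of my reading, not a verified explanation: it decodes the six bytes as the little-endian (limit, base) pair that lgdt expects, and masks the signed 16-bit operand GDB shows for lgdtw. decode_gdt_descriptor is a hypothetical helper name.

```python
import struct

def decode_gdt_descriptor(raw):
    """Split the 6-byte operand of lgdt into (limit, base), little-endian."""
    limit, base = struct.unpack("<HI", raw)
    return limit, base

# The six bytes GDB printed at 0x7c64 in the mis-linked build.
raw = bytes([0x17, 0x00, 0x4C, 0x89, 0x00, 0x00])
limit, base = decode_gdt_descriptor(raw)

# GDB shows the 16-bit operand of lgdtw as a signed number;
# masking it gives the address the instruction actually reads from.
operand = -0x769C & 0xFFFF

print(hex(limit), hex(base), hex(operand))
```

The operand works out to 0x8964, so the mis-linked lgdtw reads its descriptor from 0x8964 rather than from 0x7c64, where the bytes really are. Even the bytes at 0x7c64 carry a GDT base of 0x894c, which is also computed from the wrong link address. Under that reading, GDTR never points at the real GDT, so the ljmp with selector 0x8 has no valid descriptor to use.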

Exercise 6

We can use "objdump -f obj/kern/kernel" to see the entry point e_entry, which holds the link address of the entry point in the program: the memory address in the program's text section at which the program should begin executing.

Exercise 6 task:
We can examine memory using GDB’s x command. The GDB manual has full details, but for now, it is enough to know that the command x/Nx ADDR prints N words of memory at ADDR. (Note that both 'x’s in the command are lowercase.) Warning: The size of a word is not a universal standard. In GNU assembly, a word is two bytes (the ‘w’ in xorw, which stands for word, means 2 bytes).
Reset the machine (exit QEMU/GDB and start them again). Examine the 8 words of memory at 0x00100000 at the point the BIOS enters the boot loader, and then again at the point the boot loader enters the kernel. Why are they different? What is there at the second breakpoint? (You do not really need to use QEMU to answer this question. Just think.)

First set breakpoint 1 with break *0x7c00 and examine the memory there:

(gdb) x/8x 0x100000
0x100000:	0x00000000	0x00000000	0x00000000	0x00000000
0x100010:	0x00000000	0x00000000	0x00000000	0x00000000

Then set breakpoint 2 with break *0x10000c (the first instruction of the kernel):

(gdb) x/8x 0x100000
0x100000:	0x1badb002	0x00000000	0xe4524ffe	0x7205c766
0x100010:	0x34000004	0x0000b812	0x220f0011	0xc0200fd8

The contents differ because, by the second breakpoint, the boot loader has loaded the kernel into memory.
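The first word in the second dump, 0x1badb002, matches the Multiboot header magic value, which suggests the first three words are the Multiboot header at the very start of the kernel image. As a quick sanity check under that assumption, the sketch below verifies the Multiboot rule that magic, flags and checksum sum to zero modulo 2**32, and computes where code would start right after the header.

```python
# First three words GDB printed at 0x100000 once the kernel was loaded.
magic, flags, checksum = 0x1BADB002, 0x00000000, 0xE4524FFE

# A Multiboot header is valid when the three fields sum to zero
# modulo 2**32.
total = (magic + flags + checksum) & 0xFFFFFFFF

# The header occupies three 4-byte words, so code starts right after it.
first_instruction = 0x100000 + 3 * 4

print(hex(total), hex(first_instruction))
```

The sum is zero, and the address after the header is 0x10000c, the same address used for breakpoint 2.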
Run objdump -x obj/kern/kernel to see all of the headers:

Program Header:
    LOAD off    0x00001000 vaddr 0xf0100000 paddr 0x00100000 align 2**12
         filesz 0x0000716c memsz 0x0000716c flags r-x
    LOAD off    0x00009000 vaddr 0xf0108000 paddr 0x00108000 align 2**12
         filesz 0x0000a948 memsz 0x0000a948 flags rw-
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**4
         filesz 0x00000000 memsz 0x00000000 flags rwx

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00001917  f0100000  00100000  00001000  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       00000714  f0101920  00101920  00002920  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .stab         00003889  f0102034  00102034  00003034  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .stabstr      000018af  f01058bd  001058bd  000068bd  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .data         0000a300  f0108000  00108000  00009000  2**12
                  CONTENTS, ALLOC, LOAD, DATA
  5 .bss          00000648  f0112300  00112300  00013300  2**5
                  CONTENTS, ALLOC, LOAD, DATA
  6 .comment      0000002d  00000000  00000000  00013948  2**0
                  CONTENTS, READONLY

So what is stored at 0x100000 should be the .text section.
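That conclusion can be double-checked against the Sections table with a small lookup. The sizes and LMAs below are copied from the objdump output; section_at is a hypothetical helper that returns the section whose load range contains a given physical address.

```python
# (name, size, LMA) copied from the Sections table in the objdump output.
SECTIONS = [
    (".text",    0x1917, 0x100000),
    (".rodata",  0x0714, 0x101920),
    (".stab",    0x3889, 0x102034),
    (".stabstr", 0x18AF, 0x1058BD),
    (".data",    0xA300, 0x108000),
    (".bss",     0x0648, 0x112300),
]

def section_at(addr):
    """Return the name of the section whose [LMA, LMA + size) holds addr."""
    for name, size, lma in SECTIONS:
        if lma <= addr < lma + size:
            return name
    return None  # addr falls in padding or outside the image

print(section_at(0x100000), section_at(0x10000C))
```

Both 0x100000 and the entry address 0x10000c fall inside .text.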
