http://blog.chinaunix.net/uid-24774106-id-3266816.html
很久以来,就想写一篇关于ext 家族文件系统的文章,源于我刚工作的时候,曾经一不小心rm -rf,误删除了很多文件,当时真想有个数据恢复软件能帮我把数据回复了。当然学习数据恢复,首先要学习文件系统。最近工作原因,好长时间没看学习Linux kernel 相关的东西,感觉面目可憎。扯远了,开始我们的ext2 文件系统的探索之旅。
- root@libin:~# dd if=/dev/zero of=bean bs=1K count=512000
- 记录了512000 0 的读入
- 记录了512000 0 的写出
- 524288000字节(524 MB)已复制,9.40989 秒,55.7 MB/秒
- root@libin:~# ll bean
- -rw-r--r-- 1 root root 524288000 2012-07-06 22:24 bean
- root@libin:~# ll -h bean
- -rw-r--r-- 1 root root 500M 2012-07-06 22:24 bean
- root@libin:~#
- root@libin:~#
- root@libin:~# losetup /dev/loop0 bean
- root@libin:~# cat /proc/partitions
- major minor #blocks name
- 7 0 512000 loop0
- 8 0 312571224 sda
- 8 1 49182966 sda1
- .......
- oot@libin:~# mke2fs /dev/loop0
- mke2fs 1.41.11 (14-Mar-2010)
- 文件系统标签=
- 操作系统:Linux
- 块大小=1024 (log=0)
- 分块大小=1024 (log=0)
- Stride=0 blocks, Stripe width=0 blocks
- 128016 inodes, 512000 blocks
- 25600 blocks (5.00%) reserved for the super user
- 第一个数据块=1
- Maximum filesystem blocks=67633152
- 63 block groups
- 8192 blocks per group, 8192 fragments per group
- 2032 inodes per group
- Superblock backups stored on blocks:
- 8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409
- 正在写入inode表: 完成
- Writing superblocks and filesystem accounting information: 完成
- This filesystem will be automatically checked every 24 mounts or
- 180 days, whichever comes first. Use tune2fs -c or -i to override.
但是这样还没完,我们还是不能访问我们新建的ext2文件系统,因为还没有挂载,我决定将loop 设备挂载在/mnt/bean 目录下。
- mkdir /mnt/bean
- mount -t ext2 /dev/loop0 /mnt/bean
- root@libin:/mnt/bean# mount
- .........
- /dev/loop0 on /mnt/bean type ext2 (rw)
- root@libin:/mnt/bean# ll
- 总用量 17
- drwxr-xr-x 3 root root 1024 2012-07-06 22:31 ./
- drwxr-xr-x 4 root root 4096 2012-07-06 22:32 ../
- drwx------ 2 root root 12288 2012-07-06 22:31 lost found/
经过我们的努力,我们终于创建好了我们的ext2文件系统。下面需要讲讲ext2文件系统的结构是什么样的了。
- Superblock backups stored on blocks:
- 8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409
这是格式化loop设备输出到终端的result信息,因为每个块组是8192个块(原因后面讲),所以第0个块组 ,第1块组,第3个块组 第5个块组,第7个块组,第9个块组,第25个块组,第27个块组,第49个块组存储有超级块。
- struct ext2_super_block {
- __u32 s_inodes_count;
- __u32 s_blocks_count;
- __u32 s_r_blocks_count;
- __u32 s_free_blocks_count;
- __u32 s_free_inodes_count;
- __u32 s_first_data_block;
- __u32 s_log_block_size;
- __u32 s_dummy3[7];
- unsigned char s_magic[2];
- __u16 s_state;
- ...
- }
下面我们通过debugfs来获取一下ext2的相关信息。
- root@libin:/mnt/bean# dumpe2fs /dev/loop0
- dumpe2fs 1.41.11 (14-Mar-2010)
- Filesystem volume name: <none>
- Last mounted on: <not available>
- Filesystem UUID: 3bff7535-6f39-4720-9b64-1dc8cf9fe61d
- Filesystem magic number: 0xEF53
- Filesystem revision #: 1 (dynamic)
- Filesystem features: ext_attr resize_inode dir_index filetype sparse_super
- Filesystem flags: signed_directory_hash
- Default mount options: (none)
- Filesystem state: not clean
- Errors behavior: Continue
- Filesystem OS type: Linux
- Inode count: 128016
- Block count: 512000
- Reserved block count: 25600
- Free blocks: 493526
- Free inodes: 128005
- First block: 1
- Block size: 1024
- Fragment size: 1024
- Reserved GDT blocks: 256
- Blocks per group: 8192
- Fragments per group: 8192
- Inodes per group: 2032
- Inode blocks per group: 254
- Filesystem created: Fri Jul 6 22:31:09 2012
- Last mount time: Fri Jul 6 22:33:28 2012
- Last write time: Fri Jul 6 22:33:28 2012
- Mount count: 1
- Maximum mount count: 24
- Last checked: Fri Jul 6 22:31:09 2012
- Check interval: 15552000 (6 months)
- Next check after: Wed Jan 2 22:31:09 2013
- Reserved blocks uid: 0 (user root)
- Reserved blocks gid: 0 (group root)
- First inode: 11
- Inode size: 128
- Default directory hash: half_md4
- Directory Hash Seed: 0140915d-91ae-43df-9d84-9536cedc0d2b
- Group 0: (Blocks 1-8192)
- 主 superblock at 1, Group descriptors at 2-3
- 保留的GDT块位于 4-259
- Block bitmap at 260 ( 259), Inode bitmap at 261 ( 260)
- Inode表位于 262-515 ( 261)
- 7663 free blocks, 2021 free inodes, 2 directories
- 可用块数: 530-8192
- 可用inode数: 12-2032
- ...
- Group 62: (Blocks 507905-511999)
- Block bitmap at 507905 (+0), Inode bitmap at 507906 (+1)
- Inode表位于 507907-508160 (+2)
- 3839 free blocks, 2032 free inodes, 0 directories
- 可用块数: 508161-511999
- 可用inode数: 125985-128016
OK ,我们拿到了这些信息,但是,我怎么证明debugfs拿到的信息是对的呢。只有一个办法,我们钻到超级块里面,根据超级块数据结构,获得超级块每个字段的值,听起来很刺激吧,OK,Just DO IT。
- root@libin:/mnt/bean# dd if=/dev/loop0 bs=1k count=261 |od -tx1 -Ax > /tmp/dump_hex
- 记录了261 0 的读入
- 记录了261 0 的写出
- 267264字节(267 kB)已复制,0.0393023 秒,6.8 MB/秒
- root@libin:/mnt/bean# vi /tmp/dump_hex
我将整个loop设备前面的261K字节读入了/tmp/dump_hex中。其中第0块是启动块,按下不提。第一块就是说super block。很激动,我们终于可以和传说中的超级块赤裸相见了。
- 000400 10 f4 01 00 00 d0 07 00 00 64 00 00 d6 87 07 00
- 000410 05 f4 01 00 01 00 00 00 00 00 00 00 00 00 00 00
- 000420 00 20 00 00 00 20 00 00 f0 07 00 00 5f cb f7 4f
- 000430 5f cb f7 4f 01 00 1a 00 53 ef 00 00 01 00 00 00
- 000440 25 cb f7 4f 00 4e ed 00 00 00 00 00 01 00 00 00
- 000450 00 00 00 00 0b 00 00 00 80 00 00 00 38 00 00 00
- 000460 02 00 00 00 01 00 00 00 5a 65 4b 92 fe 63 43 eb
- 000470 b6 86 3e f3 6e 44 19 af 00 00 00 00 00 00 00 00
- 000480 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
- *
- 0004c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
- 0004d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
- 0004e0 00 00 00 00 00 00 00 00 00 00 00 00 f9 6f 16 79
- 0004f0 b7 dc 4f 8a a1 a1 18 82 72 a7 d8 25 01 00 00 00
- 000500 00 00 00 00 00 00 00 00 25 cb f7 4f 00 00 00 00
- 000510 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
- *
- 000560 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
- 000570 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
- *
- 000800 04 01 00 00 05 01 00 00 06 01 00 00 ef 1d e5 07
最左边一列是地址,16进制。000400=1K,换句话说,就是文件第1K个字节。000800 =2K,这就是我们朝思暮想的超级块啊。我很激动,所以把整个超级块都贴上了,幸好我不是靠字数来骗稿费的人,否则咱得被鄙视死。
- struct ext2_super_block {
- __u32 s_inodes_count;
- __u32 s_blocks_count;
- __u32 s_r_blocks_count;
- __u32 s_free_blocks_count;
- __u32 s_free_inodes_count;
- __u32 s_first_data_block;
- __u32 s_log_block_size;
- ...
- }
第一个字段叫s_inodes_count, 占四个字节。OK,我们看,从1K开始前四个字节是10 f4 01 00。我们知道有little-endian和big-endian。ext2设计者为了支持文件系统的可移动,规定磁盘上一律是little-endian,数据读入内存中时,kernel来负责把格式转成cpu的本机格式。
- struct ext2_group_desc
- {
- __u32 bg_block_bitmap; /* Blocks bitmap block */
- __u32 bg_inode_bitmap; /* Inodes bitmap block */
- __u32 bg_inode_table; /* Inodes table block */
- __u16 bg_free_blocks_count; /* Free blocks count */
- __u16 bg_free_inodes_count; /* Free inodes count */
- __u16 bg_used_dirs_count; /* Directories count */
- __u16 bg_flags;
- __u32 bg_exclude_bitmap_lo;/* Exclude bitmap for snapshots */
- __u16 bg_block_bitmap_csum_lo;/* crc32c(s_uuid+grp_num+bitmap)LSB */
- __u16 bg_inode_bitmap_csum_lo;/* crc32c(s_uuid+grp_num+bitmap)LSB */
- __u16 bg_itable_unused; /* Unused inodes count */
- __u16 bg_checksum; /* crc16(s_uuid+grouo_num+group_desc)*/
- };
- Group 1: (Blocks 8193-16384)
- 备份 superblock at 8193, Group descriptors at 8194-8195
- 保留的GDT块位于 8196-8451
- Block bitmap at 8452 (+259), Inode bitmap at 8453 (+260)
- Inode表位于 8454-8707 (+261)
- 7677 free blocks, 2032 free inodes, 0 directories
- 可用块数: 8708-16384
- 可用inode数: 2033-4064
- Group 2: (Blocks 16385-24576)
- Block bitmap at 16385 (+0), Inode bitmap at 16386 (+1)
- Inode表位于 16387-16640 (+2)
- 7936 free blocks, 2032 free inodes, 0 directories
- 可用块数: 16641-24576
- 可用inode数: 4065-6096
看上图,debugfs出来的信息,Group 2,并没有所谓的组描述符。而Group1,用8194和8195两个块来存储。OK,我们看下,里面存储的是什么东西。
- 000800 04 01 00 00 05 01 00 00 06 01 00 00 ef 1d e5 07
- 000810 02 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00 块组0的组描述符
- -----------------------------------------------------------------------
- 000820 04 21 00 00 05 21 00 00 06 21 00 00 fd 1d f0 07
- 000830 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00 块组1的组描述符
- -----------------------------------------------------------------------
- 000840 01 40 00 00 02 40 00 00 03 40 00 00 00 1f f0 07
- 000850 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00 块组2的组描述符
- ------------------------------------------------------------------------
- 000860 04 61 00 00 05 61 00 00 06 61 00 00 fd 1d f0 07
- 000870 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
- 000880 01 80 00 00 02 80 00 00 03 80 00 00 00 1f f0 07
- 000890 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
- 0008a0 04 a1 00 00 05 a1 00 00 06 a1 00 00 fd 1d f0 07
- 0008b0 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
- 0008c0 01 c0 00 00 02 c0 00 00 03 c0 00 00 00 1f f0 07
- 0008d0 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
- 0008e0 04 e1 00 00 05 e1 00 00 06 e1 00 00 fd 1d f0 07
- 0008f0 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
- 000900 01 00 01 00 02 00 01 00 03 00 01 00 00 1f f0 07
- 000910 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
- 000fb0 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
- 000fc0 01 c0 07 00 02 c0 07 00 03 c0 07 00 ff 0e f0 07
- 000fd0 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00 块组62的组描述符
- -----------------------------------------------------------------------
- 000fe0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
- * 没有块组63
- ----------------------------------------------------------------------
- 001000 04 20 00 00 04 60 00 00 04 a0 00 00 04 e0 00 00
04 01 00 00 转换成可读的十进制是0x104=259,表示数据位图位于第259块block。inode位图位于260,和debugfs出来的信息是一样的(不算启动块)。0x1def=7663个空闲数据块....
- Group 25: (Blocks 204801-212992)
- 备份 superblock at 204801, Group descriptors at 204802-204803
- 保留的GDT块位于 204804-205059
- Block bitmap at 205060 (+259), Inode bitmap at 205061 (+260)
- Inode表位于 205062-205315 (+261)
- 7677 free blocks, 2032 free inodes, 0 directories
- 可用块数: 205316-212992
- 可用inode数: 50801-52832
点击(此处)折叠或打开
- root@libin:/mnt/bean# dd if=/dev/loop0 bs=1k skip=204802 count=2|od -tx1 -Ax > /tmp/dump_hex_
- 记录了2+0 的读入
- 记录了2+0 的写出
- 2048字节(2.0 kB)已复制,0.000160205 秒,12.8 MB/秒
- root@libin:/mnt/bean# vi /tmp/dump_hex_
- 000000 04 01 00 00 05 01 00 00 06 01 00 00 ef 1d e5 07
- 000010 02 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
- 000020 04 21 00 00 05 21 00 00 06 21 00 00 fd 1d f0 07
- 000030 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
- 000040 01 40 00 00 02 40 00 00 03 40 00 00 00 1f f0 07
- 000050 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
- 000060 04 61 00 00 05 61 00 00 06 61 00 00 fd 1d f0 07
- 000070 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
- 000080 01 80 00 00 02 80 00 00 03 80 00 00 00 1f f0 07
- 000090 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
- 0000a0 04 a1 00 00 05 a1 00 00 06 a1 00 00 fd 1d f0 07
- 0000b0 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
- ....
- 0007c0 01 c0 07 00 02 c0 07 00 03 c0 07 00 ff 0e f0 07
- 0007d0 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
- 0007e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
- *
- 000800
- root@libin:/mnt/bean# cd /home
- root@libin:/home# umount /dev/loop0
- root@libin:/home# cd /mnt/bean
- root@libin:/mnt/bean# ll
- 总用量 8
- drwxr-xr-x 2 root root 4096 2012-07-06 22:32 ./
- drwxr-xr-x 4 root root 4096 2012-07-06 22:32 ../
- root@libin:/mnt/bean# mke2fs -b 4096 /dev/loop0
- mke2fs 1.41.11 (14-Mar-2010)
- 文件系统标签=
- 操作系统:Linux
- 块大小=4096 (log=2)
- 分块大小=4096 (log=2)
- Stride=0 blocks, Stripe width=0 blocks
- 128000 inodes, 128000 blocks
- 6400 blocks (5.00%) reserved for the super user
- 第一个数据块=0
- Maximum filesystem blocks=134217728
- 4 block groups
- 32768 blocks per group, 32768 fragments per group
- 32000 inodes per group
- Superblock backups stored on blocks:
- 32768, 98304
- 正在写入inode表: 完成
- Writing superblocks and filesystem accounting information: 完成
- This filesystem will be automatically checked every 39 mounts or
- 180 days, whichever comes first. Use tune2fs -c or -i to override