This section introduces file systems (FS) and MTD technology.
1 FS
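The familiar file systems are ext2, ext3 and ext4, but these are all designed for disk devices. In embedded systems (ES) the usual storage device is flash, which has some peculiarities:
- Flash is divided into erase blocks, and one block is typically tens of KB or more (compare a disk sector, which is only 512 bytes). What is the downside of such a large block? Erasing a single block takes a long time.
- Flash is erased a whole block at a time, and a location generally has to be erased before it can be rewritten. So to modify a single byte in place, I first have to erase the whole block, for example 128 KB, and then write it back. Painful.
- Each block survives only a limited number of erase cycles (the figure depends on the chip; values around 100,000 are often quoted). If every write hits the same block, that block dies quickly.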
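To put rough numbers on the points above, here is a small arithmetic sketch. Both constants are assumptions chosen for illustration (the 128 KB block from the text and a guessed endurance of 100,000 cycles), not values from any datasheet.

```python
# Illustrative figures only; real values depend on the flash chip.
ERASE_BLOCK_SIZE = 128 * 1024   # bytes per erase block (the 128 KB used in the text)
ERASE_CYCLE_LIMIT = 100_000     # assumed endurance of one block, not a datasheet value


def write_amplification(changed_bytes: int) -> float:
    """Bytes physically rewritten per byte logically changed.

    Assumes the change fits inside one erase block and is done in place,
    so the whole block is erased and written back.
    """
    return ERASE_BLOCK_SIZE / changed_bytes


def seconds_until_worn_out(updates_per_second: float) -> float:
    """Time until one block reaches the assumed erase limit if every update hits it."""
    return ERASE_CYCLE_LIMIT / updates_per_second


if __name__ == "__main__":
    print("amplification for a 1-byte change:", write_amplification(1))
    hours = seconds_until_worn_out(1.0) / 3600
    print("hours until worn out at 1 update/s:", round(hours, 1))
```

Under these assumptions, a log that rewrites the same block once per second would exhaust it in a little over a day, which is why the FS has to spread writes around.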
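So, to cope with these properties, the Journalling Flash File System was created (JFFS2 is its second version). Its key feature is:
- Wear leveling: erases and writes do not keep landing on the same block. For example, if one file is written to block 1, then when a new file arrives the FS picks block 2 rather than block 1 again.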
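The idea of wear leveling can be sketched with a toy simulation. This is not JFFS2's actual algorithm; the "pick the least-worn block" policy and the function name simulate are assumptions made purely to illustrate the effect.

```python
from typing import List


def simulate(num_blocks: int, num_writes: int, wear_leveling: bool) -> List[int]:
    """Return per-block erase counts after rewriting one piece of data num_writes times."""
    erase_counts = [0] * num_blocks
    for _ in range(num_writes):
        if wear_leveling:
            # Hypothetical policy: use the least-worn block (lowest index wins ties).
            target = min(range(num_blocks), key=erase_counts.__getitem__)
        else:
            # Naive policy: always rewrite the same block.
            target = 0
        erase_counts[target] += 1
    return erase_counts


if __name__ == "__main__":
    print("naive:        ", simulate(4, 100, wear_leveling=False))
    print("wear leveling:", simulate(4, 100, wear_leveling=True))
```

With the naive policy, block 0 absorbs every erase; with leveling, the same 100 erases are spread evenly, so each block ages four times more slowly.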
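(apt-get install mtd-utils, called mtd-tools on older Debian releases, gives you quite a few useful tools. Also note that writing logs to a flash device can shorten its life.)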
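Some pseudo file systems: proc and sysfs. The systool tool can be used to inspect sysfs.
Other file systems common in ES are ramfs, rootfs and tmpfs, all of which are memory-based. How do they relate to the initramfs discussed earlier? The dependency actually runs the other way round: initramfs is a cpio archive that the kernel unpacks into rootfs at boot, and rootfs is itself an instance of ramfs or tmpfs.
A few easily confused terms need explaining here:
- ram disk: a disk built on RAM that emulates a block device, so whatever FS sits on top of it is built on a virtual block device, and which FS is used does not matter. This causes efficiency problems: read and similar operations still have to pass through the block driver, and if an ext2 FS is created on the device, the ext2 FS module is needed as well. A hassle.
- ramfs: a memory-based FS rather than one backed by a real device. Its hallmark is that a user can keep writing until system memory is exhausted.
- tmpfs: introduced to rein in that drawback; the size of a tmpfs can be specified. (This makes it feel a bit like a real device, since the storage space of a real device is limited too.)
- rootfs: a special instance of ramfs or tmpfs (depending on whether tmpfs is enabled in the Linux kernel). In addition, rootfs cannot be unmounted.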
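The following shows how to use mount's loop option to create a virtual, file-backed block device.
- First create a file full of zeros with dd: dd if=/dev/zero of=../fstest bs=1024 count=512. This copies data from if to of, 1024 bytes at a time, 512 times in total, giving a 512 KB file. Open it in a hex viewer and you will see that it contains nothing but 0x00.
- Create an FS inside the file: mkfs.ext2 ../fstest. The FS now lives in this file. After all, an FS is just a set of organizing structures such as the superblock and inodes.
- How do we mount this file that carries FS information? In other words, how do we treat the file as a block device? Use mount's loop option: mount -t ext2 -o loop ../fstest /tmp/. The file is then mounted on /tmp as a virtual block device.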
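For readers who want to see what the dd step actually produces without leaving Python, here is a minimal sketch that writes the same 512 blocks of 1024 zero bytes and checks the result. The function name create_zero_image is made up for this example; the mkfs and mount steps still need the real tools (and usually root), so they are not reproduced here.

```python
import os
import tempfile


def create_zero_image(path: str, block_size: int = 1024, count: int = 512) -> int:
    """Write count blocks of block_size zero bytes, like dd if=/dev/zero; return file size."""
    zero_block = bytes(block_size)
    with open(path, "wb") as image:
        for _ in range(count):
            image.write(zero_block)
    return os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        image_path = os.path.join(workdir, "fstest")
        size = create_zero_image(image_path)
        with open(image_path, "rb") as image:
            all_zero = image.read() == bytes(size)
        print("size in bytes:", size)
        print("only 0x00 bytes:", all_zero)
```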
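2 MTD
MTD stands for Memory Technology Device. It is really a virtual device driver layer, similar in spirit to the Virtual File System: it provides a standard API to the device drivers that operate raw flash. So how does a flash device differ from an ordinary block device?
- An ordinary block device has only two operations: read and write.
- A flash device has three: read, write and erase. It also needs a wear leveling algorithm to spread the wear.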
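The three operations can be modelled with a toy class. This is only a sketch of the usual raw flash behaviour (an erased block reads back as all 1 bits, and programming can only clear bits); the class FlashBlock is hypothetical and has nothing to do with the kernel's MTD API.

```python
class FlashBlock:
    """Toy model of one erase block of raw flash (not a real driver)."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.erase_count = 0
        self.data = bytearray(b"\xff" * size)  # erased flash reads back as all 1 bits

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self.data[offset:offset + length])

    def write(self, offset: int, payload: bytes) -> None:
        """Program bytes: bits can only go from 1 to 0, so AND with what is there."""
        if offset < 0 or offset + len(payload) > self.size:
            raise ValueError("write outside of block")
        for i, value in enumerate(payload):
            self.data[offset + i] &= value

    def erase(self) -> None:
        """Reset the whole block to 0xFF; this is the operation that wears the block."""
        self.data = bytearray(b"\xff" * self.size)
        self.erase_count += 1


if __name__ == "__main__":
    block = FlashBlock(16)
    block.write(0, b"\x0f")
    block.write(0, b"\xf0")          # overwriting without erase: 0x0f & 0xf0
    print(block.read(0, 1).hex())
    block.erase()
    print(block.read(0, 1).hex(), block.erase_count)
```

The second write does not store 0xf0; it can only clear more bits. Getting a clean 0xf0 requires an erase first, which is exactly what an ordinary block device never has to think about.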
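The key point to stress here:
SD/MMC cards, CF (Compact Flash) cards, USB flash drives and the like are not MTD devices, because they already contain a built-in Flash Translation Layer (FTL) that takes care of erasing and wear leveling (presumably implemented in the device firmware). These devices are therefore used directly as ordinary block devices.
(The description above still does not really explain how MTD is used; I will come back to it later with a source code walkthrough.)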
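2.1 Enabling MTD support in the kernel
This is easy: just turn it on in make menuconfig. There are several options.
Figure 1 MTD configuration options in the Linux kernel
Among them:
- MTD_CHAR and MTD_BLOCK enable reading and writing MTD devices in character mode and block mode, with the same meaning as for ordinary char and block devices.
- The last two set up an MTD test driver in the kernel: 8192K is the total size, and the second value, 128, is the block size. In effect the kernel creates a virtual flash device for testing.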
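So how are MTD and its related pieces configured on an ES?
- Set up partitions on the flash disk (the whole device can also be a single partition. BTW, I have never fully understood what partitioning is really for; it may just be historical.)
- Set the flash type and location. Flash devices are either NOR or NAND; the end of this section briefly covers the difference.
- Choose a suitable driver for the flash chip.
- Configure the driver in the Linux kernel.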
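First, partitioning.
Flash can be partitioned, and there is one fairly important question here: how is the partition information passed to the Linux kernel? There are two ways:
- Store the partition layout of the whole device in one block, so that the boot loader can build the corresponding information from that block at startup. Apparently only RedBoot supports this, hence the name RedBoot Partition Table. The Linux kernel can recognize this kind of table, reading the partition information through CFI (Common Flash Interface).
- Kernel Command Line Partitioning: the partitions are passed as a parameter when the kernel boots, though the kernel must be configured to support this. The command format is as follows:
Figure 2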
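Since the figure is not reproduced here, the following sketch shows the mtdparts format as I understand it from the kernel's command line partition parser: mtdparts=<mtd-id>:<partdef>[,<partdef>][;<mtd-id>:...], where each <partdef> is <size>[@<offset>][(<name>)][ro] and a size of "-" means all remaining space. The parser below is a simplified illustration, not the kernel's code: it handles only decimal sizes with k/m/g suffixes, and the function name parse_mtdparts and the sample string are assumptions for this example.

```python
import re

_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
_NUMBER = re.compile(r"(\d+)([kKmMgG]?)")
_PARTDEF = re.compile(
    r"^(?P<size>-|\d+[kKmMgG]?)"        # size, or "-" for the rest of the device
    r"(?:@(?P<offset>\d+[kKmMgG]?))?"   # optional explicit offset
    r"(?:\((?P<name>[^)]*)\))?"         # optional (name)
    r"(?P<ro>ro)?$"                     # optional read-only flag
)


def _to_bytes(text: str) -> int:
    number, unit = _NUMBER.fullmatch(text).groups()
    return int(number) * _UNITS[unit.lower()]


def parse_mtdparts(value: str) -> dict:
    """Parse 'id:partdef,...;id:...' into {mtd_id: [partition dict, ...]}."""
    if value.startswith("mtdparts="):
        value = value[len("mtdparts="):]
    layout = {}
    for mtddef in value.split(";"):
        mtd_id, _, partdefs = mtddef.partition(":")
        parts = []
        for partdef in partdefs.split(","):
            match = _PARTDEF.match(partdef)
            if match is None:
                raise ValueError(f"bad partition definition: {partdef!r}")
            size, offset = match.group("size"), match.group("offset")
            parts.append({
                "name": match.group("name"),
                "size": None if size == "-" else _to_bytes(size),
                "offset": None if offset is None else _to_bytes(offset),
                "read_only": match.group("ro") is not None,
            })
        layout[mtd_id] = parts
    return layout


if __name__ == "__main__":
    # Hypothetical layout: a 256 KB read-only boot partition, the rest for the root FS.
    for part in parse_mtdparts("mtdparts=sa1100:256k(ARMboot)ro,-(root)")["sa1100"]:
        print(part)
```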
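Next, driver mapping, that is, pairing MTD with the corresponding flash driver...
kernel/drivers/mtd/maps......, to be analyzed later
And the driver for the flash chip itself?
kernel/drivers/mtd/chips; the CFI interface is currently the popular choice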
3 References and supplementary notes
http://www.linux-mtd.infradead.org/
The original intent of MTD is:
We're working on a generic Linux subsystem for memory devices, especially Flash devices.
The aim of the system is to make it simple to provide a driver for new hardware, by providing a generic interface between the hardware drivers and the upper layers of the system.
Hardware drivers need to know nothing about the storage formats used, such as FTL, FFS2, etc., but will only need to provide simple routines for read, write and erase. Presentation of the device's contents to the user in an appropriate form will be handled by the upper layers of the system.
MTD overview
MTD subsystem (stands for Memory Technology Devices) provides an abstraction layer for raw flash devices. It makes it possible to use the same API when working with different flash types and technologies, e.g. NAND, OneNAND, NOR, AG-AND, ECC'd NOR, etc.
MTD subsystem does not deal with block devices like MMC, eMMC, SD, CompactFlash, etc. These devices are not raw flashes but they have a Flash Translation layer inside, which makes them look like block devices. These devices are the subject of the Linux block subsystem, not MTD. Please, refer to this FAQ section for a short list of the main differences between block and MTD devices. And the raw flash vs. FTL devices UBIFS section discusses this in more details.
MTD subsystem has the following interfaces.
- MTD character devices - usually referred to as /dev/mtd0, /dev/mtd1, and so on. These character devices provide I/O access to the raw flash. They support a number of ioctl calls for erasing eraseblocks, marking them as bad or checking if an eraseblock is bad, getting information about MTD devices, etc. (Surprise: /dev/mtdX is a char device!)
- The sysfs interface is relatively newer and it provides full information about each MTD device in the system. This interface is easily extensible and developers are encouraged to use the sysfs interface instead of older ioctl or /proc/mtd interfaces, when possible.
- The /proc/mtd proc file system file provides general MTD information. This is a legacy interface and the sysfs interface provides more information.
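As a quick illustration of the legacy /proc/mtd interface, here is a sketch that parses its usual layout (a header line, then one line per device with size and erase size in hexadecimal and the name in quotes). The sample text is made up for this example; on a real system you would read /proc/mtd instead.

```python
import re
from typing import Dict, List, Union

_LINE = re.compile(r'^(mtd\d+):\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"(.*)"$')

# Hypothetical content, for illustration only.
SAMPLE = '''dev:    size   erasesize  name
mtd0: 00040000 00020000 "bootloader"
mtd1: 007c0000 00020000 "rootfs"
'''


def parse_proc_mtd(text: str) -> List[Dict[str, Union[str, int]]]:
    """Return one dict per MTD device; the header and unknown lines are skipped."""
    entries = []
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            dev, size, erasesize, name = match.groups()
            entries.append({
                "dev": dev,
                "size": int(size, 16),            # sizes are printed in hex
                "erasesize": int(erasesize, 16),
                "name": name,
            })
    return entries


if __name__ == "__main__":
    for entry in parse_proc_mtd(SAMPLE):
        blocks = entry["size"] // entry["erasesize"]
        print(entry["dev"], entry["name"], "erase blocks:", blocks)
```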
MTD subsystem supports bare NAND flashes with software and hardware ECC, OneNAND flashes, CFI (Common Flash Interface) NOR flashes, and other flash types.
Additionally, MTD supports legacy FTL/NFTL "translation layers", M-Systems' DiskOnChip 2000 and Millennium chips, and PCMCIA flashes (pcmciamtd driver). But the corresponding drivers are very old and not maintained very much.
MTD Block Driver:
The mtdblock driver available in the MTD is an archaic tool which emulates block devices on top of MTD devices. It does not even have bad eraseblock handling, so it is not really usable with NAND flashes. And it works by caching a whole flash erase block in RAM, modifying it as requested, then erasing the whole block and writing back the modified. This means that mtdblock does not try to do any optimizations, and that you will lose lots of data in case of power cuts. And last, but not least, mtdblock does not do any wear-leveling.
Often people consider mtdblock as general FTL layer and try to use block-based file systems on top of bare flashes using mtdblock. This is wrong in most cases. In other words, please, do not use mtdblock unless you know exactly what you are doing.
There is also a read-only version of this driver which doesn't have the capacity to do the caching and erase/writeback, mainly for use with uCLinux where the extra RAM requirement was considered too large
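The cache, erase and write-back cycle described above, and why a power cut in the middle is so costly, can be sketched as follows. This is a toy model, not the mtdblock source; the class name CachedBlockWriter and the power_fails_after_erase flag are invented for the illustration.

```python
ERASED = 0xFF


class CachedBlockWriter:
    """Toy model of updating one erase block via a RAM cache."""

    def __init__(self, erase_block: bytes) -> None:
        self.flash = bytearray(erase_block)  # current content of one erase block

    def update(self, offset: int, payload: bytes,
               power_fails_after_erase: bool = False) -> None:
        if offset < 0 or offset + len(payload) > len(self.flash):
            raise ValueError("update outside of block")
        cache = bytearray(self.flash)                       # 1. read whole block into RAM
        cache[offset:offset + len(payload)] = payload       # 2. modify the RAM copy
        self.flash = bytearray([ERASED] * len(self.flash))  # 3. erase the whole block
        if power_fails_after_erase:
            return                                          # the RAM copy dies with the power
        self.flash[:] = cache                               # 4. write the block back


if __name__ == "__main__":
    writer = CachedBlockWriter(b"ABCDEFGH")
    writer.update(0, b"x")
    print(bytes(writer.flash))
    writer.update(1, b"y", power_fails_after_erase=True)
    print(bytes(writer.flash))
```

In the second update only one byte was being changed, yet the whole block comes back as 0xFF: the unrelated bytes are lost along with the one being written.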
These are the modules which provide interfaces that can be used directly from userspace. The user modules currently planned include:
- Raw character access: A character device which allows direct access to the underlying memory. Useful for creating filesystems on the devices, before using some of the translation drivers below, or for raw storage on infrequently-changed flash, or RAM devices.
- Raw block access: A block device driver which allows you to pretend that the flash is a normal device with sensible sector size. It actually works by caching a whole flash erase block in RAM, modifying it as requested, then erasing the whole block and writing back the modified data.
This allows you to use normal filesystems on flash parts. Obviously it's not particularly robust when you are writing to it - you lose a whole erase block's worth of data if your read/modify/erase/rewrite cycle actually goes read/modify/erase/poweroff. But for development, and for setting up filesystems which are actually going to be mounted read-only in production units, it should be fine. There is also a read-only version of this driver which doesn't have the capacity to do the caching and erase/writeback, mainly for use with uCLinux where the extra RAM requirement was considered too large.
- Flash Translation Layer (FTL), NFTL: Block device drivers which implement an FTL/NFTL filesystem on the underlying memory device. FTL is fully functional. NFTL is currently working for both reading and writing, but could probably do with some more field testing before being used on production systems.
- Journalling Flash File System, v2: This provides a filesystem directly on the flash, rather than emulating a block device. For more information, see sources.redhat.com.
MTD hardware device drivers
These provide physical access to memory devices, and are not used directly - they are accessed through the user modules above.
- On-board memory: Many PC chipsets are incapable of correctly caching system memory above 64M or 512M. A driver exists which allows you to use this memory with the linux-mtd system.
- PCMCIA devices: PCMCIA flash (not CompactFlash but real flash) cards are now supported by the pcmciamtd driver in CVS.
- Common Flash Interface (CFI) onboard NOR flash: This is a common solution and is well-tested and supported, most often using JFFS2 or cramfs file systems.
- Onboard NAND flash: NAND flash is rapidly overtaking NOR flash due to its larger size and lower cost; JFFS2 support for NAND flash is approaching production quality.
- M-Systems' DiskOnChip 2000 and Millennium: The DiskOnChip 2000, Millennium and Millennium Plus devices should be fully supported, using their native NFTL and INFTL 'translation layers'. Support for JFFS2 on DiskOnChip 2000 and Millennium is also operational although lacking proper support for bad block handling.
This brings up NOR and NAND. So what is the difference between the two?
Besides the different silicon cell design, the most important difference between NAND and NOR Flash is the bus interface. NOR Flash is connected directly to an address/data bus, like other memory devices such as SRAM. NAND Flash uses a multiplexed I/O interface with some additional control pins. NAND flash is a sequential access device appropriate for mass storage applications, while NOR flash is a random access device appropriate for code storage applications. NOR Flash can be used for code storage and code execution. Code stored on NAND Flash can't be executed from there. It must be loaded into RAM and executed from there.
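- NOR can be connected directly to the CPU, just like memory. NAND cannot, because it needs additional I/O control interfaces. So NAND is more like a disk, and NOR is more like memory.
- NOR is more expensive than NAND. NAND is suited to sequential reads, while NOR supports random access.
- Consequently, code can be stored in NOR and the CPU can fetch and execute it in place. NAND cannot do this (mainly because the CPU cannot address NAND directly when fetching instructions).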