Analysis of the MTD Architecture in the Linux Kernel

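I. Introduction

MTD (Memory Technology Device) is the Linux subsystem for accessing memory devices (RAM, ROM, flash). Its main purpose is to make drivers for new memory devices simpler to write; to that end it provides an abstract interface between the hardware and the upper layers. All MTD source code lives under the drivers/mtd subdirectory. (Reference: Baidu Baike)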

We're working on a generic Linux subsystem for memory devices, especially Flash devices.

The aim of the system is to make it simple to provide a driver for new hardware, by providing a generic interface between the hardware drivers and the upper layers of the system.

Hardware drivers need to know nothing about the storage formats used, such as FTL, FFS2, etc., but will only need to provide simple routines for read, write and erase. Presentation of the device's contents to the user in an appropriate form will be handled by the upper layers of the system.

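Traditionally, UNIX knows only block devices and character devices. A character device, such as a keyboard or a mouse, is one from which you read the current data; it cannot be seeked and has no size. A block device has a fixed size and can be seeked; it is organized into blocks of many bytes, typically 512 bytes each.

Flash fits neither the block-device nor the character-device description. It behaves much like a block device, but with differences: a block device, for example, does not distinguish between write and erase operations. A special device type matching the characteristics of flash was therefore created: the MTD device. MTD is thus neither a block device nor a character device.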
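The write/erase distinction can be made concrete with a small user-space model. This is only a sketch under simplifying assumptions (NOR-like behaviour, a tiny made-up geometry); the names flash_erase_block, flash_program and flash_read are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ERASE_BLOCK_SIZE 16u   /* hypothetical tiny erase block, for illustration */
#define FLASH_SIZE       64u   /* hypothetical device size: 4 erase blocks */

static uint8_t flash[FLASH_SIZE];

/* Erase works on whole blocks only and resets every bit in the block to 1. */
void flash_erase_block(unsigned block)
{
    assert(block < FLASH_SIZE / ERASE_BLOCK_SIZE);
    memset(&flash[block * ERASE_BLOCK_SIZE], 0xFF, ERASE_BLOCK_SIZE);
}

/* Programming can only clear bits (1 -> 0); it never sets a bit back to 1. */
void flash_program(unsigned addr, uint8_t value)
{
    assert(addr < FLASH_SIZE);
    flash[addr] &= value;
}

/* Reading is unrestricted, as on a block device. */
uint8_t flash_read(unsigned addr)
{
    assert(addr < FLASH_SIZE);
    return flash[addr];
}
```

In this model, programming 0xF0 and then 0x0F to the same byte leaves 0x00 rather than 0x0F; only an erase of the whole block brings the byte back to 0xFF. A plain block device has no such asymmetry, which is why a separate device type is useful.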


Note: MTD was designed mainly for NOR flash and NAND flash; everything else, such as interface mapping, RAM and ROM, is an auxiliary feature.

II. Design Structure


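Note: flash file systems are not built on top of the MTD block device.

MTD provides the framework for operating flash, so that upper layers can call it through a uniform interface. The flash itself is always driven through a hardware controller, and the controller's driver implements the concrete MTD interface. Flash chips of the same kind (NAND or NOR) follow the same operating procedure, because their commands and timing conform to international standards, so a further driver framework is factored out for each kind, covering most of the generic read, write and erase logic; map_info and nand_chip are examples. In the end, a flash driver only has to implement the attributes that are specific to its controller and flash chip, and use them to fill in map_info or nand_chip.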

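An MTD device is usually divided into four layers: device nodes, the MTD device layer, the MTD raw device layer and the hardware driver layer. Their roles are as follows.
  • Hardware driver layer: the flash hardware driver layer reads, writes and erases the flash hardware, mainly NOR flash (mtd/chips) and NAND flash (mtd/nand).
  • MTD raw device layer: this layer has two parts. One is the generic code for MTD raw devices; the other is the data specific to each flash, such as its partitions.
  • MTD device layer: on top of the MTD raw devices, Linux defines the MTD block device (major number 31) and the MTD character device (major number 90), which together form the MTD device layer. The MTD block device defines a structure, mtdblk_dev, that describes an MTD block device, and declares an array of pointers named mtdblks; each mtdblk_dev in this array corresponds one-to-one to an mtd_info in mtd_table.
  • Device nodes: mknod is used to create MTD character device nodes (/dev/mtd#) and MTD block device nodes (/dev/mtdblock#) under /dev; user space accesses the MTD character and block devices through these nodes.

After the kernel has booted, the mount command can be used to mount the remaining flash partitions as file systems on a mount point.
Note: a flash file system sits on the MTD raw device layer and reads and writes by calling the MTD interfaces directly. Upwards it exposes its interface to the VFS; a VFS request is served either through a device node or directly by the file system itself.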


III. Usage

For reference only; excerpted from the MTD website.

The MTD system is divided into two types of module: "users" and "drivers".

Drivers are the modules which provide raw read/write/erase access to physical memory devices.

Users are the modules which use MTD drivers and provide a higher-level interface to user-space.

We currently have four 'user' modules available: FTL, NFTL, JFFS and MTDBLOCK. FTL and NFTL both provide a pseudo-block device on which a 'normal' filesystem is placed. JFFS is a filesystem which runs directly on the flash, and MTDBLOCK performs no translation - just provides a block device interface directly to the underlying MTD driver.

Just because I use the word 'module', it doesn't mean that these have to be loadable modules. You can link them statically into your kernel.

Writing a driver module

Instructions for writing a driver are very simple:

  • Allocate and populate a struct mtd_info with information about your device, and pointers to your access routines.
  • Register it by calling add_mtd_device

Oh yes - you have to actually write the access routines too, which have to conform to the rules.

Writing a user module

This is only slightly more complex:

  • Write a pair of notifier add and remove functions, which will be called whenever a driver is added to, or removed from, the system, respectively.
  • Register them by calling register_mtd_notifier

This ought to call your notifier function immediately for all drivers which are already present in the system. But it doesn't yet. Currently, drivers scan through calling get_mtd_device() to find previously-loaded drivers. This is bad and will be fixed soon.

1. MTD user modules

These are the modules which provide interfaces that can be used directly from userspace. The user modules currently planned include:

  • Raw character access:
    A character device which allows direct access to the underlying memory. Useful
    for creating filesystems on the devices, before using some of the translation
    drivers below, or for raw storage on infrequently-changed flash, or RAM
    devices.
  • Raw block access
    A block device driver which allows you to pretend that the flash is a normal
    device with sensible sector size. It actually works by caching a whole flash
    erase block in RAM, modifying it as requested, then erasing the whole block and
    writing back the modified data.
    This allows you to use normal filesystems on flash parts. Obviously it's not
    particularly robust when you are writing to it - you lose a whole erase block's
    worth of data if your read/modify/erase/rewrite cycle actually goes
    read/modify/erase/poweroff. But for development, and for setting up filesystems
    which are actually going to be mounted read-only in production units, it should
    be fine. 
    There is also a read-only version of this driver which doesn't have the
    capacity to do the caching and erase/writeback, mainly for use with uCLinux
    where the extra RAM requirement was considered too large.
  • Flash Translation Layer (FTL)
  • NFTL
    Block device drivers which implement an FTL/NFTL filesystem on the underlying
    memory device. FTL is fully functional. NFTL is currently working for both
    reading and writing, but could probably do with some more field testing before
    being used on production systems.
  • Journalling Flash File System, v2
    This provides a filesystem directly on the flash, rather than emulating a
    block device. For more information, see sources.redhat.com.

2. MTD hardware device drivers

These provide physical access to memory devices, and are not used directly - they are accessed through the user modules above.

  • On-board memory
    Many PC chipsets are incapable of correctly caching system memory above 64M or
    512M. A driver exists which allows you to use this memory with the linux-mtd
    system.
  • PCMCIA devices
    PCMCIA flash (not CompactFlash but real flash) cards are now
    supported by the pcmciamtd driver in CVS.
  • Common Flash Interface (CFI) onboard NOR flash
    This is a common solution and is well-tested and supported, most often using
    JFFS2 or cramfs file systems.
  • Onboard NAND flash
    NAND flash is rapidly overtaking NOR flash due to its larger size and lower cost;
    JFFS2 support for NAND flash is approaching production quality.
  • M-Systems' DiskOnChip 2000 and Millennium
    The DiskOnChip 2000, Millennium and Millennium Plus devices should be fully
    supported, using their native NFTL and INFTL 'translation layers'. Support for
    JFFS2 on DiskOnChip 2000 and Millennium is also operational although lacking
    proper support for bad block handling.
  • CompactFlash - http://www.compactflash.org/
    CompactFlash emulates an IDE disk, either through the PCMCIA-ATA standard, or
    by connecting directly to an IDE interface. 
    As such, it has no business being on this page, as to the best of my knowledge
    it doesn't have any alternative method of accessing the flash - you have to
    use the IDE emulation - I mention it here for completeness.

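IV. Directory Layout

All MTD source code lives under drivers/mtd. The directory listing described here is based on the 2.6.30 kernel; the 3.2 kernel differs slightly.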


Kconfig and the Makefile give a rough overview of the contents:

#
# Makefile for the memory technology device drivers.
#

# Core functionality.
obj-$(CONFIG_MTD)               += mtd.o
mtd-y                           := mtdcore.o mtdsuper.o mtdbdi.o
mtd-$(CONFIG_MTD_PARTITIONS)    += mtdpart.o

obj-$(CONFIG_MTD_CONCAT)        += mtdconcat.o
obj-$(CONFIG_MTD_REDBOOT_PARTS) += redboot.o
obj-$(CONFIG_MTD_CMDLINE_PARTS) += cmdlinepart.o
obj-$(CONFIG_MTD_AFS_PARTS)     += afs.o
obj-$(CONFIG_MTD_AR7_PARTS)     += ar7part.o
obj-$(CONFIG_MTD_OF_PARTS)      += ofpart.o

# 'Users' - code which presents functionality to userspace.
obj-$(CONFIG_MTD_CHAR)          += mtdchar.o
obj-$(CONFIG_MTD_BLKDEVS)       += mtd_blkdevs.o
obj-$(CONFIG_MTD_BLOCK)         += mtdblock.o
obj-$(CONFIG_MTD_BLOCK_RO)      += mtdblock_ro.o
obj-$(CONFIG_FTL)               += ftl.o
obj-$(CONFIG_NFTL)              += nftl.o
obj-$(CONFIG_INFTL)             += inftl.o
obj-$(CONFIG_RFD_FTL)           += rfd_ftl.o
obj-$(CONFIG_SSFDC)             += ssfdc.o
obj-$(CONFIG_MTD_OOPS)          += mtdoops.o

nftl-objs       := nftlcore.o nftlmount.o
inftl-objs      := inftlcore.o inftlmount.o

obj-y           += chips/ lpddr/ maps/ devices/ nand/ onenand/ tests/

obj-$(CONFIG_MTD_UBI)           += ubi/

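The mtd module consists of the core files mtdcore.c, mtdsuper.c and mtdbdi.c, the partition file mtdpart.c, and the device-layer files mtdchar.c, mtd_blkdevs.c and mtdblock.c.

mtd_blkdevs: common interface to block layer for MTD 'translation layers'

mtdblock: caching block device access to MTD devices; selects MTD_BLKDEVS

The subdirectories are:

chips: NOR flash support; generic drivers for the CFI and JEDEC interfaces.

lpddr: low-power DDR devices; can be ignored unless such a device is managed through MTD.

maps: flash address mappings for the various boards that use NOR flash. This directory contains no NAND flash information.

devices: self-contained MTD device drivers; each device registers itself with the system as an MTD device (add_mtd_device()). For an example, see the DataFlash devices in at91_dataflash.c. Files of this kind contain both the hardware driver and the raw device interface, so the device layer (char/block) can use them directly.

nand: NAND flash support.

onenand: OneNAND flash support.

tests: test code.

ubi: UBI support (the volume management layer that the UBIFS file system runs on).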

V. Data Structures

1. mtd_info

include/linux/mtd/mtd.h

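mtd_info is the structure that represents an MTD raw device. Each partition is also treated as an mtd_info: for example, with two MTD raw devices that have three partitions each, the system holds six mtd_info structures in total. Pointers to these mtd_info structures are stored in an array named mtd_table.

The read(), write(), read_oob() and write_oob() members of mtd_info are the main functions an MTD device driver has to implement.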

struct mtd_info {
    u_char type;        /* memory technology type */
    uint32_t flags;     /* flag bits */
    uint64_t size;      /* total size of the MTD device */

    /* "Major" erase size for the device. Naïve users may take this
     * to be the only erase size available, or may use the more detailed
     * information below if they desire
     */
    uint32_t erasesize; /* erase size of the main block */

    /* Minimal writable flash unit size. In case of NOR flash it is 1 (even
     * though individual bits can be cleared), in case of NAND flash it is
     * one NAND page (or half, or one-fourths of it), in case of ECC-ed NOR
     * it is of ECC block size, etc. It is illegal to have writesize = 0.
     * Any driver registering a struct mtd_info must ensure a writesize of
     * 1 or larger.
     */
    uint32_t writesize; /* size in bytes of the smallest writable unit */

    uint32_t oobsize;   /* amount of OOB data per block (e.g. 16) */
    uint32_t oobavail;  /* available OOB bytes per block */

    /*
     * If erasesize is a power of 2 then the shift is stored in
     * erasesize_shift otherwise erasesize_shift is zero. Ditto writesize.
     */
    unsigned int erasesize_shift;
    unsigned int writesize_shift;
    /* Masks based on erasesize_shift and writesize_shift */
    unsigned int erasesize_mask;
    unsigned int writesize_mask;

    /* Kernel-only stuff starts here. */
    const char *name;
    int index;

    /* ecc layout structure pointer - read only ! */
    struct nand_ecclayout *ecclayout;

    /* Data for variable erase regions. If numeraseregions is zero,
     * it means that the whole device has erasesize as given above.
     */
    int numeraseregions;    /* number of regions with differing erase sizes */
    struct mtd_erase_region_info *eraseregions;

    /*
     * Erase is an asynchronous operation.  Device drivers are supposed
     * to call instr->callback() whenever the operation completes, even
     * if it completes with a failure.
     * Callers are supposed to pass a callback function and wait for it
     * to be called before writing to the block.
     */
    int (*erase) (struct mtd_info *mtd, struct erase_info *instr);

    /* This stuff for eXecute-In-Place */
    /* phys is optional and may be set to NULL */
    int (*point) (struct mtd_info *mtd, loff_t from, size_t len,
                  size_t *retlen, void **virt, resource_size_t *phys);

    /* We probably shouldn't allow XIP if the unpoint isn't a NULL */
    void (*unpoint) (struct mtd_info *mtd, loff_t from, size_t len);

    /* Allow NOMMU mmap() to directly map the device (if not NULL)
     * - return the address to which the offset maps
     * - return -ENOSYS to indicate refusal to do the mapping
     */
    unsigned long (*get_unmapped_area) (struct mtd_info *mtd,
                                        unsigned long len,
                                        unsigned long offset,
                                        unsigned long flags);

    /* Backing device capabilities for this device
     * - provides mmap capabilities
     */
    struct backing_dev_info *backing_dev_info;

    /* read from / write to the flash */
    int (*read) (struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen, u_char *buf);
    int (*write) (struct mtd_info *mtd, loff_t to, size_t len, size_t *retlen, const u_char *buf);

    /* In blackbox flight recorder like scenarios we want to make successful
       writes in interrupt context. panic_write() is only intended to be
       called when its known the kernel is about to panic and we need the
       write to succeed. Since the kernel is not going to be running for much
       longer, this function can break locks and delay to ensure the write
       succeeds (but not sleep). */
    int (*panic_write) (struct mtd_info *mtd, loff_t to, size_t len, size_t *retlen, const u_char *buf);

    /* read / write the out-of-band area */
    int (*read_oob) (struct mtd_info *mtd, loff_t from,
                     struct mtd_oob_ops *ops);
    int (*write_oob) (struct mtd_info *mtd, loff_t to,
                      struct mtd_oob_ops *ops);

    /*
     * Methods to access the protection register area, present in some
     * flash devices. The user data is one time programmable but the
     * factory data is read only.
     */
    int (*get_fact_prot_info) (struct mtd_info *mtd, struct otp_info *buf, size_t len);
    int (*read_fact_prot_reg) (struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen, u_char *buf);
    int (*get_user_prot_info) (struct mtd_info *mtd, struct otp_info *buf, size_t len);
    int (*read_user_prot_reg) (struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen, u_char *buf);
    int (*write_user_prot_reg) (struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen, u_char *buf);
    int (*lock_user_prot_reg) (struct mtd_info *mtd, loff_t from, size_t len);

    /* kvec-based read/write methods.
       NB: The 'count' parameter is the number of _vectors_, each of
       which contains an (ofs, len) tuple.
    */
    int (*writev) (struct mtd_info *mtd, const struct kvec *vecs, unsigned long count, loff_t to, size_t *retlen);

    /* Sync */
    void (*sync) (struct mtd_info *mtd);

    /* Chip-supported device locking */
    int (*lock) (struct mtd_info *mtd, loff_t ofs, uint64_t len);
    int (*unlock) (struct mtd_info *mtd, loff_t ofs, uint64_t len);
    int (*is_locked) (struct mtd_info *mtd, loff_t ofs, uint64_t len);

    /* Power Management functions */
    int (*suspend) (struct mtd_info *mtd);
    void (*resume) (struct mtd_info *mtd);

    /* Bad block management functions */
    int (*block_isbad) (struct mtd_info *mtd, loff_t ofs);
    int (*block_markbad) (struct mtd_info *mtd, loff_t ofs);

    struct notifier_block reboot_notifier;  /* default mode before reboot */

    /* ECC status information */
    struct mtd_ecc_stats ecc_stats;
    /* Subpage shift (NAND) */
    int subpage_sft;

    void *priv;         /* driver-private data */

    struct module *owner;
    struct device dev;
    int usecount;

    /* If the driver is something smart, like UBI, it may need to maintain
     * its own reference counting. The below functions are only for driver.
     * The driver may register its callbacks. These callbacks are not
     * supposed to be called by MTD users */
    int (*get_device) (struct mtd_info *mtd);
    void (*put_device) (struct mtd_info *mtd);
};
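The comment on erasesize_shift and erasesize_mask in the structure above can be illustrated with a short user-space sketch. The helper names size_to_shift, shift_to_mask and offset_to_block are hypothetical and exist only for this illustration; the rule they follow (shift is log2 of the size when the size is a power of two, zero otherwise) is taken from that comment.

```c
#include <assert.h>
#include <stdint.h>

static int is_power_of_two(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* log2(size) when size is a power of two, 0 otherwise. */
unsigned int size_to_shift(uint32_t size)
{
    unsigned int shift = 0;

    if (!is_power_of_two(size))
        return 0;
    while ((size >> shift) != 1u)
        shift++;
    return shift;
}

/* Mask that extracts the offset inside one unit of (1 << shift) bytes. */
unsigned int shift_to_mask(unsigned int shift)
{
    assert(shift < 32);
    return (1u << shift) - 1u;
}

/* Erase block index of a byte offset: shift when possible, divide otherwise. */
uint32_t offset_to_block(uint64_t offset, uint32_t erasesize)
{
    unsigned int shift = size_to_shift(erasesize);

    assert(erasesize != 0);
    if (shift != 0)
        return (uint32_t)(offset >> shift);
    return (uint32_t)(offset / erasesize);
}
```

For a 128 KiB erase block (0x20000) the shift is 17 and the mask is 0x1FFFF, so users of the structure can replace divisions by shifts; for an odd erase size such as 0x21000 the shift is 0 and a real division is needed.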

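Flash drivers use the following two functions to register and unregister an MTD device:

int add_mtd_device(struct mtd_info *mtd);

int del_mtd_device(struct mtd_info *mtd);

The member functions of mtd_info hardly ever appear in NOR and NAND driver code (that is, they are transparent to the flash driver), because Linux implements generic mtd_info member functions for NOR and NAND in the layer below MTD.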

2. mtd_table

Stores the mtd_info structures currently known to the system.

drivers/mtd/mtdcore.c

struct mtd_info *mtd_table[MAX_MTD_DEVICES];
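How a fixed-size table of pointers such as mtd_table can hand out device indices is sketched below as a user-space toy. This is an assumption-laden illustration, not the code in mtdcore.c: struct toy_mtd, toy_add_device, toy_del_device and toy_count_devices are hypothetical names, and the table size of 32 is chosen arbitrarily.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVICES 32   /* arbitrary size for this sketch */

struct toy_mtd {         /* hypothetical stand-in for struct mtd_info */
    const char *name;
    int index;           /* slot in the table, -1 when unregistered */
};

static struct toy_mtd *toy_table[MAX_DEVICES];

/* Register a device in the first free slot; nonzero means the table is full. */
int toy_add_device(struct toy_mtd *mtd)
{
    assert(mtd != NULL);
    for (int i = 0; i < MAX_DEVICES; i++) {
        if (toy_table[i] == NULL) {
            toy_table[i] = mtd;
            mtd->index = i;
            return 0;
        }
    }
    return 1;
}

/* Unregister a device; -1 means it was not in the table. */
int toy_del_device(struct toy_mtd *mtd)
{
    assert(mtd != NULL);
    if (mtd->index < 0 || mtd->index >= MAX_DEVICES ||
        toy_table[mtd->index] != mtd)
        return -1;
    toy_table[mtd->index] = NULL;
    mtd->index = -1;
    return 0;
}

/* Number of occupied slots. */
int toy_count_devices(void)
{
    int n = 0;

    for (int i = 0; i < MAX_DEVICES; i++)
        if (toy_table[i] != NULL)
            n++;
    return n;
}
```

With two raw devices of three partitions each, six calls to toy_add_device fill slots 0 to 5, matching the six mtd_info entries from the earlier example; a slot freed by toy_del_device is reused by the next registration.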

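3. mtd_part (MTD partition structure)

drivers/mtd/mtdpart.c

mtd_part describes a partition. Its mtd_info member describes the partition itself and is added to mtd_table. Most fields of that member are determined by the partition's master, mtd_part->master, and its function pointers likewise refer to the master's corresponding functions. The master (whose size covers all partitions) is not itself added to mtd_table as an MTD raw device.

struct mtd_part {
    struct mtd_info mtd;      /* this partition's info (mostly derived from its master) */
    struct mtd_info *master;  /* the master device this partition belongs to */
    uint64_t offset;          /* offset of the partition within the master */
    struct list_head list;    /* node in the list of partitions */
};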

/*
 * Given a pointer to the MTD object in the mtd_part structure, we can retrieve
 * the pointer to that structure with this macro.
 */
#define PART(x)  ((struct mtd_part *)(x))

4. mtd_partition

mtd_partition carries the partition information that is passed in when the MTD raw device layer calls add_mtd_partitions().

/*
 * Partition definition structure:
 *
 * An array of struct partition is passed along with a MTD object to
 * mtd_device_register() to create them.
 *
 * For each partition, these fields are available:
 * name: string that will be used to label the partition's MTD device.
 * size: the partition size; if defined as MTDPART_SIZ_FULL, the partition
 *  will extend to the end of the master MTD device.
 * offset: absolute starting position within the master MTD device; if
 *  defined as MTDPART_OFS_APPEND, the partition will start where the
 *  previous one ended; if MTDPART_OFS_NXTBLK, at the next erase block;
 *  if MTDPART_OFS_RETAIN, consume as much as possible, leaving size
 *  after the end of partition.
 * mask_flags: contains flags that have to be masked (removed) from the
 *  master MTD flag set for the corresponding MTD partition.
 *  For example, to force a read-only partition, simply adding
 *  MTD_WRITEABLE to the mask_flags will do the trick.
 *
 * Note: writeable partitions require their size and offset be
 * erasesize aligned (e.g. use MTDPART_OFS_NXTBLK).
 */

struct mtd_partition {
    char *name;                         /* identifier string */
    uint64_t size;                      /* partition size */
    uint64_t offset;                    /* offset within the master MTD space */
    uint32_t mask_flags;                /* master MTD flags to mask out for this partition */
    struct nand_ecclayout *ecclayout;   /* out of band layout for this partition (NAND only) */
};

Flash drivers use the following two functions to register and unregister partitions:

int add_mtd_partitions(struct mtd_info *master, struct mtd_partition *parts, int nbparts);

int del_mtd_partitions(struct mtd_info *master);
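The way the "append" and "full" sentinels in a partition table turn into concrete offsets and sizes can be sketched in user space. This is a simplified model, not the logic of mtdpart.c: struct sk_partition, SK_SIZ_FULL, SK_OFS_APPEND and sk_resolve_layout are hypothetical names, erase-block alignment is ignored, and the sketch rejects a partition that does not fit, which is only one possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sentinels for this sketch only; the kernel's MTDPART_* macros play the same role. */
#define SK_SIZ_FULL    0u
#define SK_OFS_APPEND  UINT64_MAX

struct sk_partition {        /* hypothetical, reduced form of struct mtd_partition */
    const char *name;
    uint64_t size;
    uint64_t offset;
};

/* Resolve the sentinels in place. Returns 0 on success, -1 if a partition does not fit. */
int sk_resolve_layout(struct sk_partition *parts, int nparts, uint64_t master_size)
{
    uint64_t cur = 0;        /* end of the previous partition */

    assert(parts != NULL && nparts >= 0);
    for (int i = 0; i < nparts; i++) {
        if (parts[i].offset == SK_OFS_APPEND)
            parts[i].offset = cur;                 /* start where the last one ended */
        if (parts[i].offset >= master_size)
            return -1;
        if (parts[i].size == SK_SIZ_FULL)
            parts[i].size = master_size - parts[i].offset;  /* extend to the end */
        if (parts[i].size > master_size - parts[i].offset)
            return -1;
        cur = parts[i].offset + parts[i].size;
    }
    return 0;
}
```

For a 16 MiB master with a 256 KiB "boot" partition at offset 0, a 2 MiB appended "kernel" partition and an appended "rootfs" partition of full size, the sketch places kernel at 0x40000 and rootfs at 0x240000, with rootfs covering the remaining 0xDC0000 bytes.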

5. mtd_notifier

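struct mtd_notifier {
    void (*add)(struct mtd_info *mtd);
    void (*remove)(struct mtd_info *mtd);
    struct list_head list;
};

The MTD notifier holds the functions that are called when MTD devices and raw devices are added or removed. In the device layer, when the MTD character device or block device registers, and CONFIG_DEVFS_FS is defined, an mtd_notifier is added to the mtd_notifiers list of the MTD raw device layer. Its functions are called in two situations. First, when an MTD character or block device is added or removed, that device's notifier is run once over all the MTD raw devices below it. Second, when an MTD raw device is added or removed, every registered notifier is run once on that raw device.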
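The two call situations described above can be modelled with a small user-space sketch. Everything here is hypothetical (struct sk_dev, struct sk_notifier, sk_register_user, sk_add_device and the counting demo user); fixed arrays stand in for the kernel's linked lists, and removal is left out to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAX_DEVS  8
#define SK_MAX_USERS 4

struct sk_dev { const char *name; };          /* stand-in for a raw MTD device */

struct sk_notifier {                          /* stand-in for struct mtd_notifier */
    void (*add)(struct sk_dev *dev);
    void (*remove)(struct sk_dev *dev);
};

static struct sk_dev *sk_devs[SK_MAX_DEVS];
static struct sk_notifier *sk_users[SK_MAX_USERS];

/* Case 1: a new user registers, so add() is replayed for every device already present. */
int sk_register_user(struct sk_notifier *n)
{
    assert(n != NULL && n->add != NULL);
    for (int u = 0; u < SK_MAX_USERS; u++) {
        if (sk_users[u] == NULL) {
            sk_users[u] = n;
            for (int d = 0; d < SK_MAX_DEVS; d++)
                if (sk_devs[d] != NULL)
                    n->add(sk_devs[d]);
            return 0;
        }
    }
    return -1;
}

/* Case 2: a new raw device appears, so add() of every registered user is called. */
int sk_add_device(struct sk_dev *dev)
{
    assert(dev != NULL);
    for (int d = 0; d < SK_MAX_DEVS; d++) {
        if (sk_devs[d] == NULL) {
            sk_devs[d] = dev;
            for (int u = 0; u < SK_MAX_USERS; u++)
                if (sk_users[u] != NULL)
                    sk_users[u]->add(dev);
            return 0;
        }
    }
    return -1;
}

/* Demo user that only counts how many devices it currently knows about. */
static int sk_seen;
static void sk_count_add(struct sk_dev *dev)    { (void)dev; sk_seen++; }
static void sk_count_remove(struct sk_dev *dev) { (void)dev; sk_seen--; }
struct sk_notifier sk_demo_user = { sk_count_add, sk_count_remove };
int sk_demo_seen(void) { return sk_seen; }
```

If one device is added before the demo user registers and one after, the user still ends up having seen both: the first through the replay at registration time, the second through the per-device broadcast.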

VI. Functions

1. MTD device layer: character device (mtdchar.c)

This file implements character device support and registers struct file_operations mtd_fops.

static const struct file_operations mtd_fops = {
    .owner      = THIS_MODULE,
    .llseek     = mtd_lseek,
    .read       = mtd_read,
    .write      = mtd_write,
    .ioctl      = mtd_ioctl,
    .open       = mtd_open,
    .release    = mtd_close,
    .mmap       = mtd_mmap,
#ifndef CONFIG_MMU
    .get_unmapped_area = mtd_get_unmapped_area,
#endif
};

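These functions operate on an mtd_info and call its read, write and erase operations underneath.

2. MTD device layer: block devices (mtd_blkdevs.c, mtdblock.c)

The former provides support for translation layers; the latter provides caching.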

static struct mtd_blktrans_ops mtdblock_tr = {
    .name       = "mtdblock",
    .major      = 31,
    .part_bits  = 0,
    .blksize    = 512,
    .open       = mtdblock_open,
    .flush      = mtdblock_flush,
    .release    = mtdblock_release,
    .readsect   = mtdblock_readsect,
    .writesect  = mtdblock_writesect,
    .add_mtd    = mtdblock_add_mtd,
    .remove_dev = mtdblock_remove_dev,
    .owner      = THIS_MODULE,
};

static struct block_device_operations mtd_blktrans_ops = {
    .owner          = THIS_MODULE,
    .open           = blktrans_open,
    .release        = blktrans_release,
    .locked_ioctl   = blktrans_ioctl,
    .getgeo         = blktrans_getgeo,
};

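These functions operate on an mtd_info and call its read, write and erase operations underneath.

The basic principle is to copy the contents of a flash erase block into memory, modify the copy there, then erase the block on flash and write the in-memory copy back to it. The whole sequence is called the read/modify/erase/rewrite cycle. It is not safe, however: if the sequence turns into read/modify/erase/poweroff, the data of that block is lost.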
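The cycle can be sketched as a user-space model. It is a deliberately simplified illustration and not the code in mtdblock.c: the geometry (512-byte sectors, 4 KiB erase blocks) and the names write_sector, read_sector and block_erase_count are assumptions, and the sketch writes the block back immediately, whereas a caching driver may keep the modified block in RAM for a while.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE      512u
#define ERASE_BLOCK_SIZE 4096u   /* hypothetical: 8 sectors per erase block */
#define NUM_BLOCKS       4u

static uint8_t flash[NUM_BLOCKS * ERASE_BLOCK_SIZE];
static unsigned erase_count[NUM_BLOCKS];

static void erase_block(unsigned block)
{
    memset(&flash[block * ERASE_BLOCK_SIZE], 0xFF, ERASE_BLOCK_SIZE);
    erase_count[block]++;
}

/* Write one 512-byte sector using the read/modify/erase/rewrite cycle. */
void write_sector(unsigned sector, const uint8_t *data)
{
    static uint8_t cache[ERASE_BLOCK_SIZE];
    unsigned long offset = (unsigned long)sector * SECTOR_SIZE;
    unsigned block = (unsigned)(offset / ERASE_BLOCK_SIZE);
    unsigned within = (unsigned)(offset % ERASE_BLOCK_SIZE);

    assert(block < NUM_BLOCKS);
    memcpy(cache, &flash[block * ERASE_BLOCK_SIZE], ERASE_BLOCK_SIZE); /* read */
    memcpy(&cache[within], data, SECTOR_SIZE);                         /* modify */
    erase_block(block);             /* erase: the block now lives only in RAM */
    memcpy(&flash[block * ERASE_BLOCK_SIZE], cache, ERASE_BLOCK_SIZE); /* rewrite */
}

/* Return a pointer to the stored contents of one sector. */
const uint8_t *read_sector(unsigned sector)
{
    assert(sector < NUM_BLOCKS * ERASE_BLOCK_SIZE / SECTOR_SIZE);
    return &flash[(unsigned long)sector * SECTOR_SIZE];
}

/* How often a given erase block has been erased so far. */
unsigned block_erase_count(unsigned block)
{
    assert(block < NUM_BLOCKS);
    return erase_count[block];
}
```

Writing sectors 0 and 1 costs two erases of block 0 in this model, and between erase_block() and the final memcpy() the only copy of the other seven sectors is the RAM cache, which is exactly the window in which a power failure loses the whole erase block.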

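The block device emulation driver locates files by block number and offset, so apart from the file data itself almost no extra control data is stored on the flash.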

3. MTD raw device layer: mtdcore.c

Core registration and callback routines for MTD drivers and users.

EXPORT_SYMBOL_GPL(add_mtd_device);

EXPORT_SYMBOL_GPL(del_mtd_device);

EXPORT_SYMBOL_GPL(get_mtd_device);

EXPORT_SYMBOL_GPL(get_mtd_device_nm);

EXPORT_SYMBOL_GPL(put_mtd_device);

EXPORT_SYMBOL_GPL(register_mtd_user);

EXPORT_SYMBOL_GPL(unregister_mtd_user);

EXPORT_SYMBOL_GPL(default_mtd_writev);

extern int add_mtd_device(struct mtd_info *mtd);

extern int del_mtd_device (struct mtd_info *mtd);

extern struct mtd_info *get_mtd_device(struct mtd_info *mtd, int num);

extern struct mtd_info *get_mtd_device_nm(const char *name);

extern void put_mtd_device(struct mtd_info *mtd);

struct mtd_notifier {

void (*add)(struct mtd_info *mtd);

void (*remove)(struct mtd_info *mtd);

struct list_head list;

};

extern void register_mtd_user (struct mtd_notifier *new);

extern int unregister_mtd_user (struct mtd_notifier *old);

int default_mtd_writev(struct mtd_info *mtd, const struct kvec *vecs,

unsigned long count, loff_t to, size_t *retlen);

int default_mtd_readv(struct mtd_info *mtd, struct kvec *vecs,

unsigned long count, loff_t from, size_t *retlen);

struct mtd_info *get_mtd_device(struct mtd_info *mtd, int num)
Obtains a handle to a usable MTD device.
@mtd: last known address of the required MTD device
@num: internal device number of the required MTD device
Given a number and NULL address, return the num’th entry in the device table, if any.
Given an address and num==-1, search the device table for a device with that address and return if it’s still present.
Given both, return the num’th driver only if its address matches.

4. MTD raw device layer: mtdsuper.c

Manages MTD superblocks.

extern int get_sb_mtd(struct file_system_type *fs_type, int flags,
                      const char *dev_name, void *data,
                      int (*fill_super)(struct super_block *, void *, int),
                      struct vfsmount *mnt);

extern void kill_mtd_super(struct super_block *sb);

5. MTD raw device layer: mtdbdi.c

MTD backing device capabilities.

/*
 * backing device capabilities for non-mappable devices (such as NAND flash)
 * - permits private mappings, copies are taken of the data
 */
struct backing_dev_info mtd_bdi_unmappable = {
    .capabilities   = BDI_CAP_MAP_COPY,
};

6. MTD raw device layer: mtdpart.c

Partition support.

int add_mtd_partitions(struct mtd_info *, const struct mtd_partition *, int);

int del_mtd_partitions(struct mtd_info *);

/*
 * Functions dealing with the various ways of partitioning the space
 */

struct mtd_part_parser {
    struct list_head list;
    struct module *owner;
    const char *name;
    int (*parse_fn)(struct mtd_info *, struct mtd_partition **, unsigned long);
};

extern int register_mtd_parser(struct mtd_part_parser *parser);

extern int deregister_mtd_parser(struct mtd_part_parser *parser);

extern int parse_mtd_partitions(struct mtd_info *master, const char **types,
                                struct mtd_partition **pparts, unsigned long origin);

Every partition is initialized from its master.

7. MTD raw device layer: cmdlinepart.c

Read flash partition table from command line

struct cmdline_mtd_partition {
    struct cmdline_mtd_partition *next;
    char *mtd_id;
    int num_parts;
    struct mtd_partition *parts;
};

static struct mtd_part_parser cmdline_parser = {
    .owner = THIS_MODULE,
    .parse_fn = parse_cmdline_partitions,
    .name = "cmdlinepart",
};

static int __init cmdline_parser_init(void)
{
    return register_mtd_parser(&cmdline_parser);
}

8. Key functions, restated

Flash drivers use the following two functions to register and unregister an MTD device:

int add_mtd_device(struct mtd_info *mtd);

int del_mtd_device(struct mtd_info *mtd);

Flash drivers use the following two functions to register and unregister partitions:

int add_mtd_partitions(struct mtd_info *master, struct mtd_partition *parts, int nbparts);

int del_mtd_partitions(struct mtd_info *master);

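VII. NOR Flash

The NOR flash code is located under drivers/mtd/chips.

Flash can be probed through the CFI interface (cfi_probe.c) or the JEDEC interface (jedec_probe.c); both kinds of device use gen_probe.c.

Different manufacturers use different command sets. Linux MTD currently implements two: the AMD/Fujitsu standard command set and the Intel/Sharp extended command set (which is compatible with the Intel/Sharp standard command set), implemented in cfi_cmdset_0002.c and cfi_cmdset_0001.c respectively. There are also some flash chips that do not follow the CFI standard: the probe code for "jedec" type flash is in jedec.c, for "sharp" type flash in sharp.c, and for "amd_flash" type flash in amd_flash.c. Finally, there are some non-flash MTDs, such as ROM or absent devices, whose probe code is in map_rom.c, map_ram.c and map_absent.c. All chip types are driven through do_map_probe() in chipreg.c.

The core of a NOR flash driver is defining the map_info structure, which specifies the base address, bus width and size of the NOR flash as well as the flash read and write functions. A NOR flash driver can be seen as the process of probing the chip according to map_info.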

/* The map stuff is very simple. You fill in your struct map_info with
   a handful of routines for accessing the device, making sure they handle
   paging etc. correctly if your device needs it. Then you pass it off
   to a chip probe routine -- either JEDEC or CFI probe or both -- via
   do_map_probe(). If a chip is recognised, the probe code will invoke the
   appropriate chip driver (if present) and return a struct mtd_info.
   At which point, you fill in the mtd->module with your own module
   address, and register it with the MTD core code. Or you could partition
   it and register the partitions instead, or keep it for your own private
   use; whatever.

   The mtd->priv field will point to the struct map_info, and any further
   private data required by the chip driver is linked from the
   mtd->priv->fldrv_priv field. This allows the map driver to get at
   the destructor function map->fldrv_destroy() when it's tired
   of living.
*/

struct map_info {
    const char *name;
    unsigned long size;
    resource_size_t phys;
#define NO_XIP (-1UL)

    void __iomem *virt;
    void *cached;

    int bankwidth; /* in octets. This isn't necessarily the width
                      of actual bus cycles -- it's the repeat interval
                      in bytes, before you are talking to the first chip again.
                   */

#ifdef CONFIG_MTD_COMPLEX_MAPPINGS
    map_word (*read)(struct map_info *, unsigned long);
    void (*copy_from)(struct map_info *, void *, unsigned long, ssize_t);

    void (*write)(struct map_info *, const map_word, unsigned long);
    void (*copy_to)(struct map_info *, unsigned long, const void *, ssize_t);

    /* We can perhaps put in 'point' and 'unpoint' methods, if we really
       want to enable XIP for non-linear mappings. Not yet though. */
#endif
    /* It's possible for the map driver to use cached memory in its
       copy_from implementation (and _only_ with copy_from).  However,
       when the chip driver knows some flash area has changed contents,
       it will signal it to the map driver through this routine to let
       the map driver invalidate the corresponding cache as needed.
       If there is no cache to care about this can be set to NULL. */
    void (*inval_cache)(struct map_info *, unsigned long, ssize_t);

    /* set_vpp() must handle being reentered -- enable, enable, disable
       must leave it enabled. */
    void (*set_vpp)(struct map_info *, int);

    unsigned long pfow_base;
    unsigned long map_priv_1;
    unsigned long map_priv_2;
    void *fldrv_priv;
    struct mtd_chip_driver *fldrv;
};

struct mtd_chip_driver {
    struct mtd_info *(*probe)(struct map_info *map);
    void (*destroy)(struct mtd_info *);
    struct module *module;
    char *name;
    struct list_head list;
};

void register_mtd_chip_driver(struct mtd_chip_driver *);

void unregister_mtd_chip_driver(struct mtd_chip_driver *);

struct mtd_info *do_map_probe(const char *name, struct map_info *map);

void map_destroy(struct mtd_info *mtd);


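The concrete drivers for NOR flash chips live under mtd/maps; for example, the flash driver for the 9261 is implemented in mtd/maps/at91sam9261.c.

An MTD raw device can consist of one or several identical flash chips. Suppose there are four flash chips with a device type of x8, each 8 MB in size, with an interleave of 2, a start address of 0x01000000 and contiguous addresses. Together they form one MTD raw device (0x01000000-0x03000000): two of the chips are interleaved into one chip covering 0x01000000 to 0x02000000, and the other two are interleaved into one chip covering 0x02000000 to 0x03000000.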
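The address arithmetic of this example can be checked with a small user-space sketch. The structure and function names (sk_layout, sk_location, sk_total_size, sk_locate) are hypothetical, and the assumption that consecutive bus bytes alternate between the chips of an interleaved set is a simplification; the real byte-lane assignment depends on how the board is wired.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sk_layout {
    uint32_t base;          /* bus address of the first chip */
    uint32_t chip_size;     /* size of one physical chip in bytes */
    unsigned interleave;    /* chips wired in parallel per set */
    unsigned device_bytes;  /* device width in bytes (1 for x8) */
    unsigned num_chips;     /* total number of physical chips */
};

struct sk_location {
    unsigned set;           /* which interleaved set (logical chip) */
    unsigned lane;          /* which physical chip inside the set */
    uint32_t chip_offset;   /* byte offset inside that physical chip */
};

uint32_t sk_total_size(const struct sk_layout *l)
{
    assert(l != NULL);
    return l->chip_size * l->num_chips;
}

/* Map a bus address to (set, lane, offset). Returns -1 if out of range. */
int sk_locate(const struct sk_layout *l, uint32_t addr, struct sk_location *out)
{
    uint32_t rel, set_size, bankwidth, in_set;

    assert(l != NULL && out != NULL);
    assert(l->interleave > 0 && l->device_bytes > 0);
    if (addr < l->base || addr - l->base >= sk_total_size(l))
        return -1;

    rel = addr - l->base;
    set_size = l->chip_size * l->interleave;     /* 16 MiB in the example */
    bankwidth = l->interleave * l->device_bytes; /* 2 bytes in the example */
    in_set = rel % set_size;

    out->set = rel / set_size;
    out->lane = (in_set % bankwidth) / l->device_bytes;
    out->chip_offset = (in_set / bankwidth) * l->device_bytes
                       + in_set % l->device_bytes;
    return 0;
}
```

With base 0x01000000, 8 MiB chips, interleave 2 and four chips, the total size is 0x02000000, so the device ends at 0x03000000; address 0x02000000 is the first byte of the second interleaved set, matching the ranges given in the text.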

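Note that all flash chips that make up one MTD raw device must be of the same type (whether they are interleaved or address-contiguous); the data structure describing an MTD raw device uses only a single structure to describe its constituent flash chips.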

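VIII. NAND Flash

The NAND flash code is located under drivers/mtd/nand. For the NAND flash programming interface, see

http://www.linux-mtd.infradead.org/tech/mtdnand/index.html

Because this directory implements a generic NAND driver (mtd/nand/nand_base.c), a chip-level NAND driver no longer needs to implement the read(), write(), read_oob() and write_oob() members of mtd_info; the bulk of the work moves to the nand_chip data structure.

MTD uses nand_chip to represent a NAND flash chip. The structure holds the low-level machinery of the NAND flash: address information, read/write methods, ECC mode, hardware control and so on.

Kconfig and the Makefile give a quick overview of this directory. Using MTD NAND requires nand_base.c, nand_bbt.c, nand_ecc.c and nand_ids.c; the rest are NAND controller drivers, such as atmel_nand.c for Atmel and omap2.c for TI.

nand_bbt.c: bad block table support for the NAND driver.

nand_ecc.c: an ECC algorithm that detects and corrects 1 bit errors in a 256 byte block of data.

nand_ids.c: chip ID list (name, ID code, page size, chip size in megabytes, erase block size, options) and manufacturer ID list.

nand_base.c: the generic MTD driver for NAND flash devices; it implements the mapping from nand_chip to mtd_info.

atmel_nand.c: implements chip-level nand_chip support.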

/**
 * struct nand_chip - NAND Private Flash Chip Data
 * @IO_ADDR_R:      [BOARDSPECIFIC] address to read the 8 I/O lines of the flash device
 * @IO_ADDR_W:      [BOARDSPECIFIC] address to write the 8 I/O lines of the flash device
 * @read_byte:      [REPLACEABLE] read one byte from the chip
 * @read_word:      [REPLACEABLE] read one word from the chip
 * @write_buf:      [REPLACEABLE] write data from the buffer to the chip
 * @read_buf:       [REPLACEABLE] read data from the chip into the buffer
 * @verify_buf:     [REPLACEABLE] verify buffer contents against the chip data
 * @select_chip:    [REPLACEABLE] select chip nr
 * @block_bad:      [REPLACEABLE] check, if the block is bad
 * @block_markbad:  [REPLACEABLE] mark the block bad
 * @cmd_ctrl:       [BOARDSPECIFIC] hardware-specific function for controlling
 *                  ALE/CLE/nCE. Also used to write command and address
 * @dev_ready:      [BOARDSPECIFIC] hardware-specific function for accessing device ready/busy line
 *                  If set to NULL no access to ready/busy is available and the ready/busy information
 *                  is read from the chip status register
 * @cmdfunc:        [REPLACEABLE] hardware-specific function for writing commands to the chip
 * @waitfunc:       [REPLACEABLE] hardware-specific function for wait on ready
 * @ecc:            [BOARDSPECIFIC] ecc control structure
 * @buffers:        buffer structure for read/write
 * @hwcontrol:      platform-specific hardware control structure
 * @ops:            oob operation operands
 * @erase_cmd:      [INTERN] erase command write function, selectable due to AND support
 * @scan_bbt:       [REPLACEABLE] function to scan bad block table
 * @chip_delay:     [BOARDSPECIFIC] chip dependent delay for transferring data from array to read regs (tR)
 * @state:          [INTERN] the current state of the NAND device
 * @oob_poi:        poison value buffer
 * @page_shift:     [INTERN] number of address bits in a page (column address bits)
 * @phys_erase_shift: [INTERN] number of address bits in a physical eraseblock
 * @bbt_erase_shift: [INTERN] number of address bits in a bbt entry
 * @chip_shift:     [INTERN] number of address bits in one chip
 * @options:        [BOARDSPECIFIC] various chip options. They can partly be set to inform nand_scan about
 *                  special functionality. See the defines for further explanation
 * @badblockpos:    [INTERN] position of the bad block marker in the oob area
 * @cellinfo:       [INTERN] MLC/multichip data from chip ident
 * @numchips:       [INTERN] number of physical chips
 * @chipsize:       [INTERN] the size of one chip for multichip arrays
 * @pagemask:       [INTERN] page number mask = number of (pages / chip) - 1
 * @pagebuf:        [INTERN] holds the pagenumber which is currently in data_buf
 * @subpagesize:    [INTERN] holds the subpagesize
 * @ecclayout:      [REPLACEABLE] the default ecc placement scheme
 * @bbt:            [INTERN] bad block table pointer
 * @bbt_td:         [REPLACEABLE] bad block table descriptor for flash lookup
 * @bbt_md:         [REPLACEABLE] bad block table mirror descriptor
 * @badblock_pattern: [REPLACEABLE] bad block scan pattern used for initial bad block scan
 * @controller:     [REPLACEABLE] a pointer to a hardware controller structure
 *                  which is shared among multiple independent devices
 * @priv:           [OPTIONAL] pointer to private chip data
 * @errstat:        [OPTIONAL] hardware specific function to perform additional error status checks
 *                  (determine if errors are correctable)
 * @write_page:     [REPLACEABLE] High-level page write function
 */

struct nand_chip {
    void __iomem *IO_ADDR_R;
    void __iomem *IO_ADDR_W;

    uint8_t (*read_byte)(struct mtd_info *mtd);
    u16 (*read_word)(struct mtd_info *mtd);
    void (*write_buf)(struct mtd_info *mtd, const uint8_t *buf, int len);
    void (*read_buf)(struct mtd_info *mtd, uint8_t *buf, int len);
    int (*verify_buf)(struct mtd_info *mtd, const uint8_t *buf, int len);
    void (*select_chip)(struct mtd_info *mtd, int chip);
    int (*block_bad)(struct mtd_info *mtd, loff_t ofs, int getchip);
    int (*block_markbad)(struct mtd_info *mtd, loff_t ofs);
    void (*cmd_ctrl)(struct mtd_info *mtd, int dat,
                     unsigned int ctrl);
    int (*dev_ready)(struct mtd_info *mtd);
    void (*cmdfunc)(struct mtd_info *mtd, unsigned command, int column, int page_addr);
    int (*waitfunc)(struct mtd_info *mtd, struct nand_chip *this);
    void (*erase_cmd)(struct mtd_info *mtd, int page);
    int (*scan_bbt)(struct mtd_info *mtd);
    int (*errstat)(struct mtd_info *mtd, struct nand_chip *this, int state, int status, int page);
    int (*write_page)(struct mtd_info *mtd, struct nand_chip *chip,
                      const uint8_t *buf, int page, int cached, int raw);

    int chip_delay;
    unsigned int options;

    int page_shift;
    int phys_erase_shift;
    int bbt_erase_shift;
    int chip_shift;
    int numchips;
    uint64_t chipsize;
    int pagemask;
    int pagebuf;
    int subpagesize;
    uint8_t cellinfo;
    int badblockpos;

    nand_state_t state;

    uint8_t *oob_poi;
    struct nand_hw_control *controller;
    struct nand_ecclayout *ecclayout;

    struct nand_ecc_ctrl ecc;
    struct nand_buffers *buffers;
    struct nand_hw_control hwcontrol;

    struct mtd_oob_ops ops;

    uint8_t *bbt;
    struct nand_bbt_descr *bbt_td;
    struct nand_bbt_descr *bbt_md;

    struct nand_bbt_descr *badblock_pattern;

    void *priv;
};
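The [INTERN] geometry fields documented above (page_shift, phys_erase_shift, chip_shift, pagemask) follow directly from the chip's sizes. The user-space sketch below derives them for illustration; struct sk_nand_geom, sk_nand_geometry and sk_offset_to_page are hypothetical names, all sizes are assumed to be powers of two, and the example geometry (2 KiB pages, 128 KiB erase blocks, 256 MiB chip) is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sk_nand_geom {        /* reduced, hypothetical subset of struct nand_chip */
    int page_shift;          /* address bits inside one page */
    int phys_erase_shift;    /* address bits inside one erase block */
    int chip_shift;          /* address bits inside one chip */
    int pagemask;            /* (pages per chip) - 1 */
};

/* log2 of a power-of-two value. */
static int sk_log2(uint64_t x)
{
    int s = 0;

    assert(x != 0 && (x & (x - 1)) == 0);
    while (x >>= 1)
        s++;
    return s;
}

/* Derive the shift and mask fields from page, erase block and chip size. */
void sk_nand_geometry(uint32_t page_size, uint32_t erase_size,
                      uint64_t chip_size, struct sk_nand_geom *g)
{
    assert(g != NULL);
    g->page_shift = sk_log2(page_size);
    g->phys_erase_shift = sk_log2(erase_size);
    g->chip_shift = sk_log2(chip_size);
    g->pagemask = (int)(chip_size >> g->page_shift) - 1;
}

/* Page number inside the chip for a byte offset. */
int sk_offset_to_page(const struct sk_nand_geom *g, uint64_t ofs)
{
    assert(g != NULL);
    return (int)(ofs >> g->page_shift) & g->pagemask;
}
```

For the example geometry this gives page_shift 11, phys_erase_shift 17, chip_shift 28 and a pagemask of 131071, which matches the "number of (pages / chip) - 1" description in the field documentation.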


IX. Flash Testing

MTD flash test programs exist at two levels: kernel and user space.

Kernel: drivers/mtd/tests contains mtd_oobtest, mtd_pagetest, mtd_readtest, mtd_speedtest, mtd_subpagetest, mtd_torturetest, mtd_nandecctest and mtd_nandbiterrs; the tests are configured through module parameters.

User space: the MTD user-space tools provided at http://www.linux-mtd.infradead.org/.

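X. Appendix

MTD: Memory Technology Device.

JEDEC: Joint Electron Device Engineering Council.

CFI: Common Flash Interface, a flash interface standard initiated by Intel.

OOB: out of band. Some memory technologies support out-of-band data; NAND flash, for example, has 16 bytes of extra data for every 512-byte block, used for error correction or metadata.

ECC: error correction. Some hardware not only provides access to the flash but also has an ECC function. All flash devices suffer from bit flips: in some cases a bit flips, or is reported as flipped, and if the bit really has flipped, an ECC algorithm is needed to correct it.

erasesize: the size of the smallest block that a single erase command can erase.

buswidth: the width of the MTD device's interface bus.

interleave: the interleave factor; several chips are connected in parallel to act as one chip, which widens buswidth.

OTP: one-time programmable. The device can be programmed only once; after a program has been written into the IC, it cannot be changed.

BBT: bad block table.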

References:

  1. http://www.linux-mtd.infradead.org/
  2. http://www.linux-mtd.infradead.org/archive/index.html
  3. Song Baohua, Linux Device Driver Development Explained (linux设备驱动开发详解)
  4. http://blog.csdn.net/bugouyonggan/article/details/9167213
  5. http://blog.csdn.net/lizhiguo0532/article/details/6007636
  6. http://www.linux-mtd.infradead.org/tech/mtdnand/index.html