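Introduction
For many types of devices, creating a Linux kernel driver is overkill. All that is really needed is some way to handle an interrupt and provide access to the memory space of the device. The logic that controls the device does not have to live in the kernel, because the device does not need any of the other resources the kernel provides. Industrial I/O cards are one common class of such devices.
The userspace I/O system (UIO) was designed to address this situation. For a typical industrial I/O card, only a very small kernel module is needed. The main part of the driver runs in user space. This simplifies development and reduces the risk of serious bugs within a kernel module.
Please note that UIO is not a universal driver interface. Devices that are already handled well by other kernel subsystems (such as networking, serial, or USB) are not candidates for a UIO driver. Hardware that is ideally suited to a UIO driver fulfills all of the following: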
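- The device has memory that can be mapped. The device can be controlled completely by writing to this memory.
- The device usually generates interrupts.
- The device does not fit into one of the standard kernel subsystems.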
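How UIO works
Each UIO device is accessed through a device file and several sysfs attribute files. The device file is called /dev/uio0 for the first device, and /dev/uio1, /dev/uio2 and so on for subsequent devices.
/dev/uioX is used to access the address space of the card. Just use mmap() to access the registers or RAM locations of your card.
Interrupts are handled by reading from /dev/uioX. A blocking read() from /dev/uioX returns as soon as an interrupt occurs. You can also use select() on /dev/uioX to wait for an interrupt. The integer value read from /dev/uioX represents the total interrupt count. You can use this number to figure out whether you missed some interrupts.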
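Some hardware has more than one interrupt source internally, but no separate IRQ mask and status registers. For such hardware, userspace may be unable to determine what the interrupt source was if the kernel handler disables the sources by writing to the chip’s IRQ register. In that case, the kernel has to disable the IRQ completely to leave the chip’s registers untouched. The userspace part can then determine the cause of the interrupt, but it cannot re-enable interrupts.
To address these problems, UIO also implements a write() function. It is normally not used and can be ignored for hardware that has only a single interrupt source or has separate IRQ mask and status registers. If you need it, however, a write to /dev/uioX calls the irqcontrol() function implemented by the driver. You have to write a 32-bit value, usually either 0 or 1, to disable or enable interrupts. If a driver does not implement irqcontrol(), write() returns -ENOSYS.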
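The write() path can be sketched as follows. This is a minimal illustration, assuming fd is a descriptor for /dev/uioX that was opened for writing; uio_irq_set is a hypothetical helper name, not part of any UIO library.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel driver to enable (1) or disable (0)
 * interrupts by writing a 32-bit value to an open /dev/uioX descriptor.
 * Returns 0 on success, -1 on failure with errno set (ENOSYS if the
 * driver does not implement irqcontrol()). */
int uio_irq_set(int fd, int32_t irq_on)
{
    ssize_t n = write(fd, &irq_on, sizeof(irq_on));

    if (n == (ssize_t)sizeof(irq_on))
        return 0;
    if (n >= 0)
        errno = EIO; /* short write: report it as an I/O error */
    return -1;
}
```

A userspace driver would typically call uio_irq_set(fd, 1) once it has finished inspecting the chip’s registers after an interrupt.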
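To handle interrupts properly, your custom kernel module can provide its own interrupt handler. It will automatically be called by the built-in handler.
For devices that do not generate interrupts but need to be polled, you can set up a timer that triggers the interrupt handler at configurable time intervals. This interrupt simulation is done by calling uio_event_notify() from the timer’s event handler.
Each driver provides attributes that are used to read or write variables. These attributes are accessible through sysfs files. A custom kernel driver module can add its own attributes to the device owned by the uio driver, but at this time they are not added to the UIO device itself. This might change in the future if it is found to be useful.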
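The UIO framework provides the following standard attributes:
- name: The name of your device. It is recommended to use the name of your kernel module for this.
- version: A version string defined by your driver. This allows the userspace part of your driver to deal with different versions of the kernel module.
- event: The total number of interrupts handled by the driver since the last time the device node was read.
These attributes appear under the /sys/class/uio/uioX directory. Please note that this directory might be a symlink rather than a real directory. Any userspace code that accesses it must be able to handle this.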
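Reading one of these attributes from userspace can be sketched as below. The helper name uio_read_attr and its buffer sizes are assumptions made for illustration; the only thing taken from the text above is the sysfs layout.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one attribute file (e.g. "name", "version",
 * "event") from a UIO sysfs directory such as /sys/class/uio/uio0.
 * fopen() follows symlinks, so it does not matter whether the directory
 * is a real directory or a symlink. Returns 0 on success, -1 on error. */
int uio_read_attr(const char *dir, const char *attr, char *buf, size_t len)
{
    char path[512];
    FILE *f;
    int n;

    assert(dir && attr && buf);
    n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
    if (len == 0 || n < 0 || n >= (int)sizeof(path))
        return -1;
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0'; /* strip the trailing newline */
    return 0;
}
```

For example, uio_read_attr("/sys/class/uio/uio0", "version", buf, sizeof(buf)) would fill buf with the version string of the first UIO device, if one exists.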
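Each UIO device can make one or more memory regions available for memory mapping. This is necessary because some industrial I/O cards require access to more than one PCI memory region in a driver.
Each mapping has its own directory in sysfs. The first mapping appears as /sys/class/uio/uioX/maps/map0/. Subsequent mappings create the directories map1/, map2/, and so on. These directories only appear if the size of the mapping is not 0.
Each mapX/ directory contains four read-only files that show attributes of the memory: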
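- name: A string identifier for this mapping. This is optional, and the string can be empty. Drivers can set it to make it easier for userspace to find the correct mapping.
- addr: The address of the memory that can be mapped.
- size: The size, in bytes, of the memory pointed to by addr.
- offset: The offset, in bytes, that has to be added to the pointer returned by mmap() to get to the actual device memory. This is important if the device’s memory is not page aligned. Remember that pointers returned by mmap() are always page aligned, so it is good style to always add this offset.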
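The following sketch shows one way to read the numeric map attributes and apply offset to a mapped pointer. Both function names are hypothetical, and the code assumes the files hold a number that strtoull() can parse with base 0 (hexadecimal with a 0x prefix, or decimal).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a numeric attribute ("addr", "size" or
 * "offset") from a directory such as /sys/class/uio/uio0/maps/map0.
 * Returns 0 on success, -1 on error. */
int uio_read_map_value(const char *mapdir, const char *attr,
                       unsigned long long *out)
{
    char path[512], text[64], *end;
    FILE *f;
    int n;

    assert(mapdir && attr && out);
    n = snprintf(path, sizeof(path), "%s/%s", mapdir, attr);
    if (n < 0 || n >= (int)sizeof(path))
        return -1;
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(text, sizeof(text), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    errno = 0;
    *out = strtoull(text, &end, 0); /* base 0 accepts "0x..." and decimal */
    return (errno != 0 || end == text) ? -1 : 0;
}

/* Apply the sysfs "offset" to the page-aligned pointer from mmap(). */
void *uio_device_ptr(void *mapped, unsigned long long offset)
{
    return (char *)mapped + offset;
}
```

Checking size against the value your driver expects before calling mmap() is a cheap way to catch a mismatched kernel module early.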
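From userspace, the different mappings are distinguished by adjusting the offset parameter of the mmap() call. To map the memory of mapping N, you have to use N times the page size as your offset: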
offset = N * getpagesize();
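Sometimes there is hardware with memory-like regions that cannot be mapped with the technique described here, but there are still ways to access them from userspace. The most common example is x86 ioports. On x86 systems, userspace can access these ioports using ioperm(), iopl(), inb(), outb(), and similar functions.
Since these ioport regions cannot be mapped, they do not appear under /sys/class/uio/uioX/maps/ like the normal memory described above. Without information about the port regions the hardware has to offer, it becomes difficult for the userspace part of the driver to find out which ports belong to which UIO device.
To address this situation, the directory /sys/class/uio/uioX/portio/ was added. It only exists if the driver wants to pass information about one or more port regions to userspace. If that is the case, subdirectories named port0, port1, and so on appear underneath /sys/class/uio/uioX/portio/.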
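Each portX/ directory contains four read-only files that show the name, start, size, and type of the port region:
- name: A string identifier for this port region. The string is optional and can be empty. Drivers can set it to make it easier for userspace to find a certain port region.
- start: The first port of this region.
- size: The number of ports in this region.
- porttype: A string describing the type of port.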
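As a small sketch of how userspace might discover these regions, the function below counts the portN/ subdirectories. The name uio_count_port_regions is hypothetical, and the loop assumes the regions are numbered consecutively starting at port0, as in the description above.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: count the portN/ subdirectories below a portio
 * directory such as /sys/class/uio/uio0/portio. Returns 0 when the
 * directory is absent, i.e. the driver exports no port regions. */
int uio_count_port_regions(const char *portio_dir)
{
    int n = 0;

    assert(portio_dir);
    for (;;) {
        char path[512];
        struct stat st;
        int len = snprintf(path, sizeof(path), "%s/port%d", portio_dir, n);

        if (len < 0 || len >= (int)sizeof(path))
            break;
        /* stat() follows symlinks, which sysfs uses freely. */
        if (stat(path, &st) != 0 || !S_ISDIR(st.st_mode))
            break;
        n++;
    }
    return n;
}
```

The start, size, and porttype files of each region can then be read in the same way as any other sysfs attribute.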
Writing your own kernel module
struct uio_info
This structure tells the framework the details of your driver. Some of the members are required, others are optional.
- const char *name: Required. The name of your driver as it will appear in sysfs. I recommend using the name of your module for this.
- const char *version: Required. This string appears in /sys/class/uio/uioX/version.
- struct uio_mem mem[ MAX_UIO_MAPS ]: Required if you have memory that can be mapped with mmap(). For each mapping you need to fill one of the uio_mem structures. See the description below for details.
- struct uio_port port[ MAX_UIO_PORTS_REGIONS ]: Required if you want to pass information about ioports to userspace. For each port region you need to fill one of the uio_port structures. See the description below for details.
- long irq: Required. If your hardware generates an interrupt, it’s your module’s task to determine the irq number during initialization. If you don’t have a hardware generated interrupt but want to trigger the interrupt handler in some other way, set irq to UIO_IRQ_CUSTOM. If you had no interrupt at all, you could set irq to UIO_IRQ_NONE, though this rarely makes sense.
- unsigned long irq_flags: Required if you’ve set irq to a hardware interrupt number. The flags given here will be used in the call to [request_irq()](https://www.kernel.org/doc/html/latest/core-api/genericirq.html#c.request_irq).
- int (*mmap)(struct uio_info *info, struct vm_area_struct *vma): Optional. If you need a special mmap() function, you can set it here. If this pointer is not NULL, your mmap() will be called instead of the built-in one.
- int (*open)(struct uio_info *info, struct inode *inode): Optional. You might want to have your own open(), e.g. to enable interrupts only when your device is actually used.
- int (*release)(struct uio_info *info, struct inode *inode): Optional. If you define your own open(), you will probably also want a custom release() function.
- int (*irqcontrol)(struct uio_info *info, s32 irq_on): Optional. If you need to be able to enable or disable interrupts from userspace by writing to /dev/uioX, you can implement this function. The parameter irq_on will be 0 to disable interrupts and 1 to enable them.
Usually, your device will have one or more memory regions that can be mapped to user space. For each region, you have to set up a struct uio_mem in the mem[] array. Here’s a description of the fields of struct uio_mem:
- const char *name: Optional. Set this to help identify the memory region; it will show up in the corresponding sysfs node.
- int memtype: Required if the mapping is used. Set this to UIO_MEM_PHYS if you have physical memory on your card to be mapped. Use UIO_MEM_LOGICAL for logical memory (e.g. allocated with __get_free_pages() but not [kmalloc()](https://www.kernel.org/doc/html/latest/core-api/mm-api.html#c.kmalloc)). There’s also UIO_MEM_VIRTUAL for virtual memory.
- phys_addr_t addr: Required if the mapping is used. Fill in the address of your memory block. This address is the one that appears in sysfs.
- resource_size_t size: Fill in the size of the memory block that addr points to. If size is zero, the mapping is considered unused. Note that you must initialize size with zero for all unused mappings.
- void *internal_addr: If you have to access this memory region from within your kernel module, you will want to map it internally by using something like [ioremap()](https://www.kernel.org/doc/html/latest/driver-api/device-io.html#c.ioremap). Addresses returned by this function cannot be mapped to user space, so you must not store them in addr. Use internal_addr instead to remember such an address.
Writing a driver in userspace
Once you have a working kernel module for your hardware, you can write the userspace part of your driver. You don’t need any special libraries: your driver can be written in any reasonable language, you can use floating point numbers, and so on. In short, you can use all the tools and libraries you’d normally use for writing a userspace application.
Getting information about your UIO device
Information about all UIO devices is available in sysfs. The first thing you should do in your driver is check name and version to make sure you’re talking to the right device and that its kernel driver has the version you expect.
You should also make sure that the memory mapping you need exists and has the size you expect.
There is a tool called lsuio that lists UIO devices and their attributes. It is available here:
http://www.osadl.org/projects/downloads/UIO/user/
With lsuio you can quickly check if your kernel module is loaded and which attributes it exports. Have a look at the manpage for details.
The source code of lsuio can serve as an example for getting information about a UIO device. The file uio_helper.c contains a lot of functions you could use in your userspace driver code.
mmap() device memory
After you have made sure you’ve got the right device with the memory mappings you need, all you have to do is call mmap() to map the device’s memory to userspace.
The parameter offset of the mmap() call has a special meaning for UIO devices: it is used to select which mapping of your device you want to map. To map the memory of mapping N, you have to use N times the page size as your offset:
offset = N * getpagesize();
N starts from zero, so if you’ve got only one memory range to map, set offset = 0. A drawback of this technique is that memory is always mapped beginning with its start address.
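Put together, the mapping step might look like the sketch below. The helper name uio_open_and_map is hypothetical, and sysconf(_SC_PAGESIZE) is used here as a portable way to obtain the same value as getpagesize().

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a UIO device node and map its mapping number n.
 * size should be the value read from maps/mapN/size in sysfs.
 * On success returns the mapped address and stores the descriptor in
 * *fd_out (keep it open for read()/select()); on failure returns
 * MAP_FAILED. */
void *uio_open_and_map(const char *dev, unsigned int n, size_t size,
                       int *fd_out)
{
    long page = sysconf(_SC_PAGESIZE); /* same value as getpagesize() */
    void *p;
    int fd;

    assert(dev && fd_out);
    if (page <= 0)
        return MAP_FAILED;
    fd = open(dev, O_RDWR);
    if (fd < 0)
        return MAP_FAILED;
    /* For UIO the mmap() offset selects the mapping; it is not a byte
     * offset into the device memory. */
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
             fd, (off_t)n * page);
    if (p == MAP_FAILED) {
        close(fd);
        return MAP_FAILED;
    }
    *fd_out = fd;
    return p;
}
```

A call such as uio_open_and_map("/dev/uio0", 0, size, &fd) would map the first region of the first device. Remember to add the offset value from sysfs to the returned pointer if the device memory is not page aligned.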
Waiting for interrupts
After you have successfully mapped your device’s memory, you can access it like an ordinary array. Usually, you will perform some initialization. After that, your hardware starts working and will generate an interrupt as soon as it’s finished, has some data available, or needs your attention because an error occurred.
/dev/uioX is normally treated as a read-only file; the only use of write() is the interrupt control described earlier. A read() will always block until an interrupt occurs. There is only one legal value for the count parameter of read(), and that is the size of a signed 32 bit integer (4). Any other value for count causes read() to fail. The signed 32 bit integer read is the interrupt count of your device. If the value is one more than the value you read the last time, everything is OK. If the difference is greater than one, you missed interrupts.
You can also use select() on /dev/uioX.
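The waiting logic can be sketched with select() and a 4-byte read(). Both function names below are hypothetical, and the timeout handling is one possible design rather than something UIO prescribes.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for an interrupt on an open
 * /dev/uioX descriptor. Returns 1 and stores the interrupt count if one
 * arrived, 0 on timeout, -1 on error. */
int uio_wait_irq(int fd, int timeout_ms, int32_t *count)
{
    struct timeval tv = {
        .tv_sec = timeout_ms / 1000,
        .tv_usec = (timeout_ms % 1000) * 1000,
    };
    fd_set rfds;
    int ready;

    assert(count);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    ready = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (ready <= 0)
        return ready;
    /* The only legal read size is 4 bytes. */
    return read(fd, count, sizeof(*count)) == (ssize_t)sizeof(*count) ? 1 : -1;
}

/* Number of interrupts missed between two consecutive reads.
 * Unsigned arithmetic keeps the result correct if the counter wraps. */
uint32_t uio_missed_irqs(int32_t previous, int32_t current)
{
    return (uint32_t)current - (uint32_t)previous - 1u;
}
```

A result of 0 from uio_missed_irqs() means the new count is exactly one more than the previous one, which is the normal case.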
Generic PCI UIO driver
The generic driver is a kernel module named uio_pci_generic. It can work with any device compliant to PCI 2.3 (circa 2002) and any compliant PCI Express device. Using this, you only need to write the userspace driver, removing the need to write a hardware-specific kernel module.
Making the driver recognize the device
Since the driver does not declare any device IDs, it will not get loaded automatically and will not automatically bind to any devices. You must load it and assign a device ID to the driver yourself. For example:
modprobe uio_pci_generic
echo "8086 10f5" > /sys/bus/pci/drivers/uio_pci_generic/new_id
If there already is a hardware specific kernel driver for your device, the generic driver still won’t bind to it. In this case, if you want to use the generic driver (why would you?), you’ll have to manually unbind the hardware specific driver and bind the generic driver, like this:
echo -n 0000:00:19.0 > /sys/bus/pci/drivers/e1000e/unbind
echo -n 0000:00:19.0 > /sys/bus/pci/drivers/uio_pci_generic/bind
You can verify that the device has been bound to the driver by looking for it in sysfs, for example like the following:
ls -l /sys/bus/pci/devices/0000:00:19.0/driver
Which if successful should print:
.../0000:00:19.0/driver -> ../../../bus/pci/drivers/uio_pci_generic
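The same check can be done from a program by resolving the driver symlink. This is a sketch only; pci_bound_driver is a hypothetical helper, and it takes the full path of the symlink, for example /sys/bus/pci/devices/0000:00:19.0/driver.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: resolve a sysfs "driver" symlink and store the
 * name of the bound driver (the last path component of the link target)
 * in buf. Returns 0 on success, -1 if the link is missing (device not
 * bound) or on any other error. */
int pci_bound_driver(const char *driver_link, char *buf, size_t len)
{
    char target[512];
    const char *name;
    ssize_t n;

    assert(driver_link && buf);
    n = readlink(driver_link, target, sizeof(target) - 1);
    if (n < 0)
        return -1;
    target[n] = '\0'; /* readlink() does not terminate the string */
    name = strrchr(target, '/');
    name = name ? name + 1 : target;
    if (strlen(name) >= len)
        return -1;
    strcpy(buf, name);
    return 0;
}
```

After a successful bind, the stored name would be expected to compare equal to "uio_pci_generic".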
Note that the generic driver will not bind to old PCI 2.2 devices. If binding the device failed, run the following command:
dmesg
and look in the output for failure reasons.
Things to know about uio_pci_generic
Interrupts are handled using the Interrupt Disable bit in the PCI command register and Interrupt Status bit in the PCI status register. All devices compliant to PCI 2.3 (circa 2002) and all compliant PCI Express devices should support these bits. uio_pci_generic detects this support, and won’t bind to devices which do not support the Interrupt Disable Bit in the command register.
On each interrupt, uio_pci_generic sets the Interrupt Disable bit. This prevents the device from generating further interrupts until the bit is cleared. The userspace driver should clear this bit before blocking and waiting for more interrupts.
Writing userspace driver using uio_pci_generic
A userspace driver can use the PCI sysfs interface, or the libpci library that wraps it, to talk to the device and to re-enable interrupts by writing to the command register.
Example code using uio_pci_generic
Here is some sample userspace driver code using uio_pci_generic:
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>

int main(void)
{
    int uiofd;
    int configfd;
    int err;
    int i;
    unsigned icount;
    unsigned char command_high;

    uiofd = open("/dev/uio0", O_RDONLY);
    if (uiofd < 0) {
        perror("uio open");
        return errno;
    }
    configfd = open("/sys/class/uio/uio0/device/config", O_RDWR);
    if (configfd < 0) {
        perror("config open");
        return errno;
    }

    /* Read and cache the high byte of the PCI command register
       (config space offset 5). */
    err = pread(configfd, &command_high, 1, 5);
    if (err != 1) {
        perror("command config read");
        return errno;
    }
    /* Clear Interrupt Disable: bit 10 of the register, bit 2 of this byte. */
    command_high &= ~0x4;

    for (i = 0;; ++i) {
        /* Print out a message, for debugging. */
        if (i == 0)
            fprintf(stderr, "Started uio test driver.\n");
        else
            fprintf(stderr, "Interrupts: %u\n", icount);

        /****************************************/
        /* Here we got an interrupt from the
           device. Do something to it. */
        /****************************************/

        /* Re-enable interrupts. */
        err = pwrite(configfd, &command_high, 1, 5);
        if (err != 1) {
            perror("config write");
            break;
        }

        /* Wait for next interrupt. The read size must be exactly 4. */
        err = read(uiofd, &icount, 4);
        if (err != 4) {
            perror("uio read");
            break;
        }
    }
    return errno;
}
Generic Hyper-V UIO driver
The generic driver is a kernel module named uio_hv_generic. It supports devices on the Hyper-V VMBus similar to uio_pci_generic on PCI bus.
Making the driver recognize the device
Since the driver does not declare any device GUIDs, it will not get loaded automatically and will not automatically bind to any devices. You must load it and assign a device ID to the driver yourself. For example, to use the network device class GUID:
modprobe uio_hv_generic
echo "f8615163-df3e-46c5-913f-f2d2f965ed0e" > /sys/bus/vmbus/drivers/uio_hv_generic/new_id
If there already is a hardware specific kernel driver for the device, the generic driver still won’t bind to it. In this case, if you want to use the generic driver for a userspace library, you’ll have to manually unbind the hardware specific driver and bind the generic driver, using the device specific GUID like this:
echo -n ed963694-e847-4b2a-85af-bc9cfc11d6f3 > /sys/bus/vmbus/drivers/hv_netvsc/unbind
echo -n ed963694-e847-4b2a-85af-bc9cfc11d6f3 > /sys/bus/vmbus/drivers/uio_hv_generic/bind
You can verify that the device has been bound to the driver by looking for it in sysfs, for example like the following:
ls -l /sys/bus/vmbus/devices/ed963694-e847-4b2a-85af-bc9cfc11d6f3/driver
Which if successful should print:
.../ed963694-e847-4b2a-85af-bc9cfc11d6f3/driver -> ../../../bus/vmbus/drivers/uio_hv_generic