raw disk device interface abstraction

We're all dancing around a very fundamental question here: what interface
abstraction should the "raw" interface to a disk controller (and attached
disks) present?

We're not going to allow userland to write device registers directly as a
general practice, and userland code does not handle device interrupts; that's
the kernel's job. X11 is the glaring & horrible exception to the UNIX rules:
we've been unwilling to put a full graphics abstraction subsystem, with an
appropriate userland API, into the kernel (too big! too ugly! no API
agreement!), as we have with disks (filesystems), network interfaces (protocol
stacks), and serial devices (tty line disciplines).

We do generally allow userland to initiate DMA (through system calls) directly
from userland memory - that's why the raw interface is generally faster than
the block interface: no byte copying. Oh, and you get to do I/O in chunks
larger than the block interface is designed for, provided that the device (and
driver) permits it.
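
To make the userland side of that concrete, here is a minimal sketch of one
large read into an aligned buffer. The function name read_raw_chunk and its
parameters are mine, purely for illustration; whether the transfer really goes
by DMA straight into the buffer depends on the device and driver, and a real
raw device will usually also want the offset and length to be multiples of its
sector size.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for pread(2) and posix_memalign(3) */
#endif

#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to `len` bytes at byte offset `off` from `path` into a freshly
 * allocated buffer aligned to `align` bytes.  Returns the byte count from
 * pread(2), or -1 on error.  On success *out owns the buffer and the
 * caller must free() it.
 *
 * Assumption: `align` is a power of two and a multiple of sizeof(void *),
 * as posix_memalign(3) requires.
 */
ssize_t
read_raw_chunk(const char *path, off_t off, size_t len, size_t align,
    void **out)
{
	void *buf = NULL;
	ssize_t n;
	int fd;

	if (posix_memalign(&buf, align, len) != 0)
		return -1;

	fd = open(path, O_RDONLY);
	if (fd == -1) {
		free(buf);
		return -1;
	}

	/* One system call, one chunk; no stdio buffering in between. */
	n = pread(fd, buf, len, off);
	close(fd);

	if (n == -1) {
		free(buf);
		return -1;
	}
	*out = buf;
	return n;
}
```

Note that nothing here mentions blocks: the caller hands over a byte offset, a
byte count, and a pointer, and everything else is the driver's problem.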

Then there's the whole addressing question. Disk blocks used to be addressed by
cylinder/head/sector numbers, and the driver translated between block numbers
and c/h/s; now, modern disks do that translation for us, and when asked about
c/h/s they even lie to us to hide their guts (or to follow very old
abstractions). And we're talking lately about disks with 4K native blocks
rather than the traditional 512 byte blocks (though you've been able to format
properly compliant SCSI disks to block sizes other than 512 bytes for a very
long time (decades)).
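
For reference, the translation the driver used to do is the classic one
sketched below. The struct and function names are hypothetical, and the
geometry is whatever the disk claims (which, as noted, may well be a lie);
sectors are numbered from 1 by long-standing convention, cylinders and heads
from 0.

```c
#include <stdint.h>

/* Hypothetical geometry description; field names are illustrative. */
struct chs_geom {
	uint32_t heads;			/* heads (tracks) per cylinder */
	uint32_t sectors_per_track;	/* sectors per track */
};

struct chs_addr {
	uint32_t cyl;		/* 0-based */
	uint32_t head;		/* 0-based */
	uint32_t sector;	/* 1-based */
};

/* c/h/s -> linear block number */
uint64_t
chs_to_lba(const struct chs_geom *g, const struct chs_addr *a)
{
	return ((uint64_t)a->cyl * g->heads + a->head) * g->sectors_per_track
	    + (a->sector - 1);
}

/* linear block number -> c/h/s (inverse of the above) */
struct chs_addr
lba_to_chs(const struct chs_geom *g, uint64_t lba)
{
	struct chs_addr a;
	uint64_t track = lba / g->sectors_per_track;

	a.sector = (uint32_t)(lba % g->sectors_per_track) + 1;
	a.head = (uint32_t)(track % g->heads);
	a.cyl = (uint32_t)(track / g->heads);
	return a;
}
```

With a reported geometry of 16 heads and 63 sectors per track, c/h/s 2/3/4
comes out as block (2 * 16 + 3) * 63 + 3 = 2208.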

However, even "blocks" are an abstraction - UNIX wants to address everything in
bytes; just look at read(2), write(2), and lseek(2). No mention of "blocks" -
bytes are the fundamental (atomic) data & address unit of the system. We
translate that to everything else as required.
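
That translation is just integer arithmetic. A rough sketch, assuming a
nonzero block size and with names (blk_range, bytes_to_blocks) invented for
the example, of turning a byte offset and length into the blocks they touch:

```c
#include <stdint.h>

/* Hypothetical result type: the blocks covered by a byte range. */
struct blk_range {
	uint64_t first_blk;	/* first block touched */
	uint64_t nblks;		/* number of blocks touched */
	uint32_t skip;		/* bytes to skip inside the first block */
};

/*
 * Map the byte range [off, off + len) onto blocks of `blksize` bytes.
 * Assumption: blksize is nonzero (512 and 4096 being the usual suspects).
 */
struct blk_range
bytes_to_blocks(uint64_t off, uint64_t len, uint32_t blksize)
{
	struct blk_range r;

	r.first_blk = off / blksize;
	r.skip = (uint32_t)(off % blksize);
	/* Round up so a partially covered last block is still counted. */
	r.nblks = (len == 0) ? 0 : (r.skip + len + blksize - 1) / blksize;
	return r;
}
```

The same 100 bytes at offset 1000 straddle two 512-byte blocks (blocks 1 and
2) but sit inside a single 4K block (block 0), which is the sort of difference
the byte-oriented interface hides from the caller.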

So, what abstraction should the raw interface to a "disk" present? It's
going to have a translation from bytes to whatever the disk is addressed in.
The driver will handle manipulation of the device registers and handle
interrupts. Our memory allocators already tend to be conservative about
alignment, but it would not be unreasonable for a device driver that knows de
facto that a device requires aligned DMA addresses to check what's requested in
read(2)/write(2) and return EINVAL as necessary (naturally, the device man page
should document all the reasons a driver will return an error). However, some
warts are just easier to handle in the device driver than to leave for
(less capable) userland code to deal with.
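
The kind of check I have in mind is cheap. Below is a sketch only: the
function raw_check_request and its parameters are hypothetical, not taken from
any existing driver, and the particular rules (buffer aligned to dma_align,
offset and length multiples of the sector size) are assumptions about what a
given device might require.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Validate a raw read/write request before any DMA is set up.
 * Returns 0 if the request is acceptable, EINVAL otherwise (the
 * system call layer would turn that into -1 with errno set).
 *
 * Assumptions: sector_size and dma_align are nonzero, and these three
 * rules are the ones the hypothetical device actually needs.
 */
int
raw_check_request(const void *buf, size_t len, uint64_t off,
    uint32_t sector_size, uint32_t dma_align)
{
	if ((uintptr_t)buf % dma_align != 0)
		return EINVAL;	/* buffer not aligned for DMA */
	if (off % sector_size != 0)
		return EINVAL;	/* transfer starts mid-sector */
	if (len % sector_size != 0)
		return EINVAL;	/* transfer ends mid-sector */
	return 0;
}
```

Each of those EINVAL returns is exactly the sort of thing the man page should
spell out. The alternative is for the driver to absorb the wart itself, for
example by bouncing a misaligned request through an aligned buffer, at the
cost of the byte copy the raw interface was supposed to avoid.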

Another way to put the question: what is a disk? What are its fundamental
properties, and how can we design a reasonable abstraction (which in most cases
is probably not all that abstract) for userland code to reasonably deal with?

As with all things, we have tradeoffs to make; UNIX is pragmatic: a good
solution today to today's problems is better than a perfect solution (which
we've got to find some poor sod to implement!) tomorrow.