Subject: Re: RAW access to files
To: Ignatios Souvatzis
From: Chuck Silvers
List: tech-kern
Date: 12/11/2001 23:58:42

one situation where unbuffered i/o is useful is when the application is
doing random i/o on a data set much larger than memory (e.g. a database).
in this case, read ahead is not useful, and the application is usually doing
its own caching, so caching the data again in the kernel wastes memory.
the other benefit from doing the i/o straight into the application's
memory is that it saves copying the data from the cache to the
application's memory.  obviously this isn't something that most applications
would be interested in, but for databases it's a big improvement.
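
as a rough illustration of the database case: the sketch below assumes an
O_DIRECT flag to open(2) like the one Linux provides, which is not an
interface we have today. the names dio_open, dio_alloc_block and
dio_read_block, and the 4096-byte DIO_ALIGN, are made up for the example.
an application that keeps its own cache might read one block like this:

```c
#define _GNU_SOURCE		/* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* assumed block size and alignment; the real requirement depends on the device */
#define DIO_ALIGN 4096

/*
 * open a file for unbuffered reads if the platform offers O_DIRECT,
 * falling back to ordinary buffered i/o otherwise.
 * *direct is set to 1 only if the direct open succeeded.
 */
int
dio_open(const char *path, int *direct)
{
	int fd = -1;

	*direct = 0;
#ifdef O_DIRECT
	fd = open(path, O_RDONLY | O_DIRECT);
	if (fd >= 0)
		*direct = 1;
#endif
	if (fd < 0)
		fd = open(path, O_RDONLY);
	return fd;
}

/* allocate one block-sized, block-aligned buffer for the application's own cache */
void *
dio_alloc_block(void)
{
	void *p = NULL;

	if (posix_memalign(&p, DIO_ALIGN, DIO_ALIGN) != 0)
		return NULL;
	return p;
}

/*
 * read block number blkno straight into buf, with no read ahead.
 * returns the number of bytes read, or -1 on error.
 */
ssize_t
dio_read_block(int fd, void *buf, off_t blkno)
{
	/* direct i/o typically wants the buffer and the offset aligned */
	assert(((uintptr_t)buf % DIO_ALIGN) == 0);
	return pread(fd, buf, DIO_ALIGN, blkno * DIO_ALIGN);
}
```

the caller would do dio_open() once, then dio_read_block() for whatever
block its own cache misses on. with the direct open, the data is meant to
land in the application's buffer without a second copy sitting in the
kernel's cache.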

for applications that do large runs of sequential i/o, the same logic
applies, provided the application isn't going to access the data more than
once and it issues i/os of at least 64k.  read ahead doesn't gain you all
that much when you're doing large i/os, especially on modern disks that
do read ahead into the cache in the disk.  in this case there's basically
no difference to the disk driver between doing the i/o to filesystem cache
pages vs. doing the i/o straight to the application, and doing it direct saves
a memory-to-memory copy.  for large i/os, we can also use UVM loaning to
avoid the memory-to-memory copy in some cases, but where loaning is
impossible (or undesirable for whatever reason), direct i/o would still
be valuable.
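
the sequential case might look something like the following. again this is
only a sketch under assumptions: it relies on a Linux-style O_DIRECT flag
rather than anything we have now, stream_file is a made-up name, the 4096-byte
alignment is a guess at what a device would want, and the byte sum is just a
stand-in for whatever the application really does with the data.

```c
#define _GNU_SOURCE		/* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK	(64 * 1024)	/* the 64k i/o size mentioned above */
#define ALIGN	4096		/* assumed buffer alignment */

/*
 * read a whole file once, front to back, in CHUNK-sized requests,
 * bypassing the cache when O_DIRECT is available.
 * returns the total number of bytes read, or -1 on error.
 */
int64_t
stream_file(const char *path, uint64_t *sum)
{
	void *p = NULL;
	unsigned char *buf;
	int64_t total = 0;
	ssize_t n;
	int fd = -1;

	assert(CHUNK % ALIGN == 0);
#ifdef O_DIRECT
	fd = open(path, O_RDONLY | O_DIRECT);
#endif
	if (fd < 0)
		fd = open(path, O_RDONLY);	/* fall back to buffered i/o */
	if (fd < 0)
		return -1;
	if (posix_memalign(&p, ALIGN, CHUNK) != 0) {
		close(fd);
		return -1;
	}
	buf = p;

	*sum = 0;
	while ((n = read(fd, buf, CHUNK)) > 0) {
		for (ssize_t i = 0; i < n; i++)
			*sum += buf[i];
		total += n;
		/*
		 * on a regular file a short read means end of file; stop here
		 * so we never issue a read from an unaligned offset.
		 */
		if (n < CHUNK)
			break;
	}
	free(p);
	close(fd);
	return n < 0 ? -1 : total;
}
```

each request is big enough that the disk's own read ahead should keep it
streaming, and the data is touched only once, so there is little for the
filesystem cache to add.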

I don't know how much we'll want to change system utilities to make use
of this mode of i/o, but once the feature is available, we can experiment
with it and see where it works well.


On Tue, Dec 11, 2001 at 11:38:22AM +0100, Ignatios Souvatzis wrote:
> On Mon, Dec 10, 2001 at 07:42:07PM +0100, Wojciech Puchar wrote:
> > > this is more often called "direct i/o" or "unbuffered i/o".
> > > it's on the list of things I want to add, but I haven't had time yet.
> > 
> > it would be really good! with support in common mpeg/divx players etc..
> > and with option in cp and dd
> Why do you think that "raw" I/O would be more efficient than the filesystem
> + buffer cache doing readahead? 
> 	-is