Subject: Re: letting userland issue FUA writes
To: Joachim König-Baltes <>
From: Thor Lancelot Simon <>
List: tech-kern
Date: 03/17/2006 14:11:24
On Fri, Mar 17, 2006 at 09:43:04AM +0100, Joachim König-Baltes wrote:

> Or add a "const void *buf" argument to fsync_range, not as general as
> with pwritev or adding syscalls, but perhaps less intrusive on the
> interfaces.

I can't imagine why this would be "less intrusive" than adding an
optional flags argument to pwritev.  What is *buf supposed to mean,
here, "write this right now and treat it as if it had had fsync_range
applied to it after the fact"?  That seems gross, and it's no smaller
an interface change than the pwritev() one.

It seems to me that pwritev() or something like it is exactly the
right interface for passing a raw write request down through userland
and the kernel to a disk.

In fact, I think by careful use of "flags" one could even preserve
exact SCSI tag semantics down to the disk layer -- though such a
writev would have to have at least one "ordered" flag set in it
since writev is not asynchronous.  But because every request in
pwritev-with-flags would carry offset, length, and a flag word, it
is actually possible to express all the information the disk needs
to handle the write exactly as it was handed to you by the initiator,
off across the net, who thinks _you're_ a disk yourself, if that's
what you're after.
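
To make the shape of such a request concrete, here is a minimal sketch
in C.  Everything named in it is hypothetical: struct xw_req, the
XW_ORDERED and XW_FUA flag bits, and xw_pwritev_sketch() are invented
for illustration and are not a proposed or existing interface.  It
also only emulates the idea from userland, using pwrite() followed by
fsync() where a request carries the FUA flag; fsync() flushes the whole
file, so it is much coarser than a real FUA write would be, and a real
implementation would pass the flag word down to the driver instead.

```c
#include <sys/types.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical flag bits, invented for this sketch only. */
#define XW_ORDERED 0x1u   /* request must not be reordered */
#define XW_FUA     0x2u   /* request must reach stable storage */

/* One write request: buffer, length, offset, and a flag word. */
struct xw_req {
	const void *buf;
	size_t      len;
	off_t       offset;
	unsigned    flags;
};

/*
 * Userland emulation of a "pwritev with flags" call (an assumption,
 * not a real syscall).  Requests are issued one at a time in array
 * order, so ordering is trivially preserved; XW_FUA is approximated
 * by an fsync() after the write.  Returns total bytes written, or -1.
 */
ssize_t
xw_pwritev_sketch(int fd, const struct xw_req *reqs, int nreq)
{
	ssize_t total = 0;

	for (int i = 0; i < nreq; i++) {
		const char *p = reqs[i].buf;
		size_t left = reqs[i].len;
		off_t off = reqs[i].offset;

		/* pwrite() may write less than asked; loop until done. */
		while (left > 0) {
			ssize_t n = pwrite(fd, p, left, off);
			if (n <= 0)
				return -1;
			p += n;
			left -= (size_t)n;
			off += n;
		}

		/* Coarse stand-in for FUA: flush before continuing. */
		if ((reqs[i].flags & XW_FUA) != 0 && fsync(fd) < 0)
			return -1;

		total += (ssize_t)reqs[i].len;
	}
	return total;
}
```

A caller acting as a network disk target would fill one xw_req per
command received from the initiator, copying the command's offset,
length, and tag attributes into the flag word, and hand the array down
in a single call.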

  Thor Lancelot Simon	                           

  "We cannot usually in social life pursue a single value or a single moral
   aim, untroubled by the need to compromise with others."      - H.L.A. Hart