Subject: Re: Extension of fsync_range() to permit forcing disk cache flushing
To: Thor Lancelot Simon <tls@rek.tjls.com>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-kern
Date: 12/17/2004 23:46:02

On Fri, Dec 17, 2004 at 05:06:15PM -0500, Thor Lancelot Simon wrote:
> [...]
> 
> You can run SCSI drives with WCE (write cache enable) turned off, and
> because they have efficient support for multiple outstanding tagged
> commands, they can get acceptable write performance (with carefully
> crafted applications and filesystems) *without* having to allow writes
> to be marked as "complete" without being committed to stable storage.

Of course my important servers use SCSI, and of course they have write-back
disabled :)

> 
> That is, with SCSI disks, you can get acceptable performance for streams
> of small writes without letting the drive lie about whether the data is
> in cache or on the disk.
> 
> With IDE disks, you don't have disconnect/reconnect, multiple outstanding
> tagged commands, or ordered tags to serve as write barriers.  Your *only
> option* to allow the drive firmware to handle multiple commands at once,
> potentially ganging them up for efficient long writes, or reordering them
> for lower average latency, is to turn on the write cache.  If you don't
> turn it on, you get performance like a SCSI disk without tagged queueing:
> dreadful, often even for long writes, if you can't respond fast enough to
> the completion of one command to get the next one there before you miss
> a platter rotation.
> 
> The situation is simply not comparable, because the design of IDE is so
> broken that you simply can't get acceptable write performance without
> allowing the drive to claim it's completed writes when it hasn't.  That
> is not the case for SCSI.
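
To put rough numbers on the missed-rotation cost described above, here is a
back-of-the-envelope sketch. It assumes a deliberately simple model that is
not from the original discussion: only one write is outstanding at a time,
and every write ends up costing one full platter rotation because the next
command arrives too late. Seek and transfer time are ignored, and the
function names and the RPM and write-size figures are illustrative
assumptions, not measurements.

```python
def rotation_ms(rpm):
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm


def serialized_write_mib_per_s(rpm, write_kib, rotations_per_write=1.0):
    """Throughput in MiB/s under the assumed model: one outstanding write,
    each write costing `rotations_per_write` full rotations."""
    seconds_per_write = rotations_per_write * rotation_ms(rpm) / 1000.0
    return (write_kib / 1024.0) / seconds_per_write


if __name__ == "__main__":
    # Hypothetical spindle speeds and write sizes, for illustration only.
    for rpm in (5400, 7200, 10000):
        for write_kib in (16, 64):
            rate = serialized_write_mib_per_s(rpm, write_kib)
            print(f"{rpm:>5} RPM, {write_kib:>2} KiB writes: "
                  f"{rotation_ms(rpm):.2f} ms/rotation, {rate:.2f} MiB/s")
```

Under these assumptions a 7200 RPM disk takes about 8.33 ms per rotation, so
64 KiB writes issued one at a time top out at roughly 7.5 MiB/s. A real drive
will behave differently, but the model shows why a single outstanding command
without a write cache hurts even for fairly long writes.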

Note that newer IDE drives support tagged queuing, even for PATA. But the
controller has to support it too, and a developer needs to have such a
controller in hand, with programming docs, to add support. The hard part
is getting the docs :(

> 
> With newer SATA disks, *if we supported tagged command queueing*, which
> we don't, and only with controllers that actually supported it, we could
> do the right thing here.  But for the time being, with the hardware that's

I have such a controller, but not the docs ...

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference