Subject: Re: disks write-back cache
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Jason Thorpe <thorpej@wasabisystems.com>
List: tech-kern
Date: 04/26/2003 10:00:44
On Saturday, April 26, 2003, at 09:05  AM, Manuel Bouyer wrote:

> This causes problems for filesystems or applications that take measures to
> prevent problems in case of an unattended reboot (e.g. FFS, or sendmail).
> Until today I thought I was safe when using SCSI disks.
> I think the kernel should print a warning when it probes a disk with the
> write cache enabled.

Perhaps the kernel should print the cache enabled status of the drive 
with the autoconfiguration messages?
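
A minimal sketch of what such a status line could look like, assuming the
driver has already fetched the SCSI caching mode page (page code 0x08) with
MODE SENSE. The WCE and RCD bit positions are the standard ones; the function
names and the message format are hypothetical, for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CACHING_PAGE_CODE 0x08	/* SCSI caching mode page */
#define CACHING_WCE       0x04	/* byte 2, bit 2: write cache enable */
#define CACHING_RCD       0x01	/* byte 2, bit 0: read cache disable */

/*
 * Decode the cache bits.  `page` points at the mode page itself, i.e.
 * past the mode parameter header and any block descriptors.
 * Returns 0 on success, -1 if this is not a caching mode page.
 */
int
caching_page_decode(const uint8_t *page, size_t len, bool *wce, bool *rce)
{
	if (len < 3 || (page[0] & 0x3f) != CACHING_PAGE_CODE)
		return -1;
	*wce = (page[2] & CACHING_WCE) != 0;
	*rce = (page[2] & CACHING_RCD) == 0;	/* RCD set means disabled */
	return 0;
}

/* Hypothetical autoconfiguration-style message. */
int
cache_status_format(char *buf, size_t buflen, const char *xname,
    bool wce, bool rce)
{
	return snprintf(buf, buflen, "%s: read cache %s, write cache %s",
	    xname, rce ? "enabled" : "disabled",
	    wce ? "enabled (write-back)" : "disabled");
}
```

Printing the state for every drive, rather than warning only when the write
cache is on, keeps the output uniform across disks.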

> I did some benchmarks here, and it seems tagged queuing mostly hides the
> improvement from the write-back cache. On two different filesystems (on top of
> RAID-1 raidframe devices), I see a performance decrease of 2-5% writing
> a 640MB file (tested on different servers; the decrease is dependent on the
> disk model, and maybe filesystem parameters).

The tagged queueing thing is interesting.  It's actually a bit more 
complicated than you describe.  The problem is that not all drives 
allow commands to be reordered, so on those drives every tag is 
effectively an ordered tag.  I believe the command ordering behavior is 
adjustable with a mode page setting, but I don't remember which one.
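
One candidate for that setting is the "queue algorithm modifier" field in
the SCSI Control mode page (page code 0x0A, byte 3, upper four bits), where
0 means restricted reordering and 1 means unrestricted reordering. Whether
this is the field meant above is an assumption. A small sketch that reads
the field, with hypothetical names:

```c
#include <stddef.h>
#include <stdint.h>

#define CONTROL_PAGE_CODE 0x0a	/* SCSI control mode page */
#define QAM_RESTRICTED    0x0	/* restricted reordering */
#define QAM_UNRESTRICTED  0x1	/* unrestricted reordering allowed */

/*
 * Return the queue algorithm modifier (0..15) from a control mode
 * page, or -1 if the buffer is not a control mode page.  `page`
 * points past the mode parameter header and block descriptors.
 */
int
control_page_qam(const uint8_t *page, size_t len)
{
	if (len < 4 || (page[0] & 0x3f) != CONTROL_PAGE_CODE)
		return -1;
	return (page[3] >> 4) & 0x0f;
}
```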

Anyway, if the drive isn't going to reorder commands, then your 
performance can be really bad with the w/b cache disabled.  We should 
probably have some dkctl(8) settings that allow tuning these other 
kinds of disk parameters.

Also note that some drives will suffer tag starvation if you enable 
command reordering, i.e. they will wait "forever" to complete simple-tag 
commands because they're stupid :-)  The way to work around this is to 
periodically send an ordered-tag command to the drive (or maybe even 
when the number of openings on the drive crosses some low-water mark).
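
A rough sketch of that workaround as a per-drive tag selection policy. It
counts commands instead of using a timer for the "periodic" part, and all
names and thresholds here are hypothetical:

```c
enum tag_type { TAG_SIMPLE, TAG_ORDERED };

struct tag_policy {
	unsigned simple_run;	 /* simple tags issued since the last ordered tag */
	unsigned max_simple_run; /* force an ordered tag after this many */
	unsigned low_water;	 /* force one when free openings <= this */
};

/*
 * Pick the tag type for the next command.  An ordered tag makes the
 * drive finish everything queued before it, so starved simple-tag
 * commands get completed.
 */
enum tag_type
tag_policy_next(struct tag_policy *tp, unsigned free_openings)
{
	if (tp->simple_run >= tp->max_simple_run ||
	    free_openings <= tp->low_water) {
		tp->simple_run = 0;
		return TAG_ORDERED;
	}
	tp->simple_run++;
	return TAG_SIMPLE;
}
```

Note that with this version every command is ordered for as long as the
drive stays at or below the low-water mark; a real driver would probably
want some hysteresis there.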

> Or maybe put it in the filesystem layer, at mount time ?

Well... Another idea might be to make the file systems w/b cache-aware.  
I've mentioned this idea to a few people before, but no one seems to 
think it's necessary.  Anyway, the idea is that you make the file 
system issue cache flushes at its own barrier points (either explicitly 
with a separate command, or by setting a flag in its I/O request which 
causes the disk driver to do so at the end of that I/O).
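
A sketch of the flag variant, under the assumption that the driver exposes
a write routine and a cache flush routine (e.g. SYNCHRONIZE CACHE on SCSI).
The IO_BARRIER flag, the structures, and the fake disk used to exercise the
code are all hypothetical and do not correspond to any existing kernel
interface:

```c
#include <stdint.h>

#define IO_WRITE   0x01
#define IO_BARRIER 0x02	/* hypothetical: flush drive cache after this I/O */

struct io_req {
	unsigned flags;
	uint64_t blkno;
	unsigned nblks;
};

struct disk_ops {
	int  (*write)(void *cookie, const struct io_req *req);
	int  (*flush_cache)(void *cookie);
	void *cookie;
};

/* Issue the write; if the file system marked it as a barrier, flush too. */
int
disk_submit(const struct disk_ops *ops, const struct io_req *req)
{
	int error = ops->write(ops->cookie, req);

	if (error == 0 && (req->flags & IO_BARRIER) != 0)
		error = ops->flush_cache(ops->cookie);
	return error;
}

/* Fake disk that only counts operations, for trying the logic out. */
struct fake_disk { unsigned writes, flushes; };

int
fake_write(void *cookie, const struct io_req *req)
{
	(void)req;
	((struct fake_disk *)cookie)->writes++;
	return 0;
}

int
fake_flush(void *cookie)
{
	((struct fake_disk *)cookie)->flushes++;
	return 0;
}
```

With this, the file system would set IO_BARRIER only on the writes that
must be stable before it proceeds, and ordinary data writes still benefit
from the write-back cache.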

However, that's a lot of work, so issuing a warning might not be a 
horrible idea... but it might be annoying to see it all the 
time.

         -- Jason R. Thorpe <thorpej@wasabisystems.com>