Subject: Re: wd, disk write cache, sync cache, and softdep.
To: J Chapman Flack <flack@cs.purdue.edu>
From: Greg A. Woods <woods@weird.com>
List: tech-kern
Date: 12/17/2004 02:17:24
[ On Thursday, December 16, 2004 at 15:22:38 (-0500), J Chapman Flack wrote: ]
> Subject: Re: wd, disk write cache, sync cache, and softdep. 
>
> Is there any common practice existing or emerging that
> we could track?

In some professional storage systems (even not very new ones) the write
cache can only be enabled if there's a battery connected (and fully
charged) to keep it stable.  Some systems automatically disable the
write cache if the battery goes low or dead.  (I'm still trying to
figure out how to fool the battery detector and charge circuits in my
surplus StorageWorks controller because the batteries are just _way_ too
expensive to replace, especially every year as the OEM replacements only
have an 18-month lifespan!  :-)

Personally I'm happy if the storage system has a flag that the operator
can set to say, "Yes I've connected a big whopping UPS and I trust it to
keep my write caches stable until they are flushed, either automatically
or by my direct intervention."  And then I'll post a big red sign in the
machine room saying "Thou shalt not spin down any storage system or
device before clearly and carefully ensuring that all its data is
securely and safely stored permanently on its recordable media."

For this discussion that might mean that the disk drivers should
automatically always force the write cache off by default unless some
sysctl setting (optionally per-spindle/LUN/whatever-makes-sense) is
changed to say otherwise.  Then there should be a question in "sysinst"
that asks whether or not the system will be storing any critical data on
a non-UPS backed disk and if not then it can create an initial
/etc/sysctl.conf with that allow-write-cache flag turned on so that the
user will get all the performance, along with all the integrity, they
paid for.
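
A minimal sketch of that default-off policy, in C. Every name here (wcache_policy, operator_allows, and so on) is hypothetical and only illustrates the decision; none of it is an existing NetBSD driver interface or sysctl node.

```c
#include <stdbool.h>

/*
 * Hypothetical per-disk policy state.  These names are illustrative
 * assumptions, not part of any real driver.
 */
struct wcache_policy {
	bool operator_allows;	/* set from a sysctl-style knob */
	bool drive_has_cache;	/* drive reports a write cache at all */
};

/* Safe default: the operator has not vouched for a UPS yet. */
void
wcache_policy_init(struct wcache_policy *p, bool drive_has_cache)
{
	p->operator_allows = false;
	p->drive_has_cache = drive_has_cache;
}

/*
 * The driver would consult this at attach time: the write cache is
 * turned on only when the drive has one AND the operator opted in.
 */
bool
wcache_should_enable(const struct wcache_policy *p)
{
	return p->drive_has_cache && p->operator_allows;
}
```

With this shape, a freshly attached disk always starts with its write cache off, and only an explicit operator setting (for example one written into /etc/sysctl.conf by the installer) flips operator_allows to true.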

Other than that there's really no point to fooling around with things
inside the system that are just going to waste time and make the outcome
ever more confusing.


Meanwhile I'm still hoping for the day when the disk drivers that can
will ensure that all possible automatic read&write recovery-on-error
features are enabled before a storage device is made ready for use!
I've had to clean up after way too many crashes (and even one would be
too many!) because some lame factory-configured SCSI disk gave back a
hard write error during swap, all because some fool operator (that'd
often enough have been me) had forgotten to check that the AWRE bit was
set before putting the system into production.
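
As a rough sketch of the check a driver could make, the C below inspects a copy of the SCSI Read-Write Error Recovery mode page (page code 01h), where AWRE is bit 7 and ARRE is bit 6 of byte 2. The function names are hypothetical, and fetching the page with MODE SENSE and writing it back with MODE SELECT is assumed to happen elsewhere.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RW_RECOVERY_PAGE	0x01	/* Read-Write Error Recovery page */
#define RW_RECOVERY_AWRE	0x80	/* byte 2, bit 7: auto write realloc */
#define RW_RECOVERY_ARRE	0x40	/* byte 2, bit 6: auto read realloc */

/* True if the buffer holds page 01h with both AWRE and ARRE set. */
bool
rw_recovery_enabled(const uint8_t *page, size_t len)
{
	const uint8_t want = RW_RECOVERY_AWRE | RW_RECOVERY_ARRE;

	if (len < 3 || (page[0] & 0x3f) != RW_RECOVERY_PAGE)
		return false;
	return (page[2] & want) == want;
}

/*
 * Prepare the page for MODE SELECT: set AWRE and ARRE, and clear the
 * PS bit that MODE SENSE reports in byte 0 (reserved on MODE SELECT).
 * Returns false if the buffer is not page 01h.
 */
bool
rw_recovery_enable(uint8_t *page, size_t len)
{
	if (len < 3 || (page[0] & 0x3f) != RW_RECOVERY_PAGE)
		return false;
	page[0] &= 0x3f;
	page[2] |= RW_RECOVERY_AWRE | RW_RECOVERY_ARRE;
	return true;
}
```

A driver following this idea would call rw_recovery_enabled() on the page it read at attach time and, only if that returns false, send the result of rw_recovery_enable() back to the device before marking it ready.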


As for why I wouldn't just use "-o async", well that's still a wee bit
of disaster insurance I might hope to have against kernel crashes and
other bizarre behaviour that won't necessarily wipe out the write cache.

-- 
						Greg A. Woods

+1 416 218-0098                  VE3TCP            RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>          Secrets of the Weird <woods@weird.com>