Subject: Re: write cache on ATA drives
To: Thor Lancelot Simon <tls@rek.tjls.com>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-kern
Date: 12/09/2002 23:07:38
On Mon, Dec 09, 2002 at 04:54:39PM -0500, Thor Lancelot Simon wrote:
> On Mon, Dec 09, 2002 at 10:20:21PM +0100, Manuel Bouyer wrote:
> > On Thu, Dec 05, 2002 at 06:25:33PM -0500, Thor Lancelot Simon wrote:
> > > What Windows does, evidently, is to disable and then re-enable write-back
> > > caching as barriers around performing a synchronous I/O to the drive.  This
> > > forces a cache flush, so you know the sync I/O went out, while letting the
> > > drive continue to reorder async I/O to normal files, retaining much of the
> > > performance benefit.  With softdep, you'd retain most of the rest -- as
> > > much as you could _safely_ retain, anyway.
> > 
> > This is easy to do. Unfortunately, I don't think the filesystems will pass
> > us this info ...
> 
> Of course they do.  The buffer's either B_ASYNC or it's not.  If it's 
> B_ASYNC, you may either turn the write cache on or leave the write cache on,
> depending on its previous setting.  If it is B_SYNC, you must turn the write
> cache off or leave it off, depending on its previous setting.
> 
> This is basically the same thing we do in SCSI drivers with ordered vs.
> simple tags, just in a much less elegant fashion because, well, IDE sucks.
> 
> Luckily, softdep allows us to eliminate the vast majority of synchronous
> I/O, so that, in theory at least, it should be unnecessary to change the
> cache state very often.  And NEW_BUFQ_STRATEGY might help some too, by
> clustering reads (which are never synchronous) and writes (which sometimes
> are) together more effectively, I think.

Is it also true for softdeps?

> 
> Interestingly, this was the original genesis of B_ORDERED: with an explicit
> barrier operation, it's easier to force something like a cache flush (which
> is what sending a single ordered tag is equivalent to when there are many
> outstanding simple tags on a SCSI device, essentially) only when you want or
> need one.  What started Jason and I thinking about it was the question "what
> would you need to do to have arbitrary I/O reordering while ensuring that
> LFS segment writes were safe"?  In any case, Microsoft's example would seem
> to teach that it's possible to simply use any synchronous I/O as the barrier
> that flushes the cache, for some filesystems at least.  I'd be curious to
> know what impact this had on FFS performance.
> 
> There was some speculation that Microsoft actually had some way to make
> _single commands_ write-through on IDE devices, avoiding the full cache
> flush, but IIRC a Microsoft employee popped up in comp.arch.storage last
> time this was being discussed and denied that...

If such a mechanism existed, it would probably not be documented ...

I don't know about LFS, but if I remember correctly, softdeps implements its
own write barrier by waiting for certain I/O to complete before
starting new writes (basically, it needs metadata I/O to be kept ordered,
but data can be written in no particular order).
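
To make the cache-toggling scheme quoted above concrete, here is a minimal
sketch in C. It is a toy simulation, not the NetBSD wd(4) code: the names
sk_drive, sk_set_wcache, sk_write and SK_B_ASYNC are all hypothetical, and
it assumes (as the quoted description does) that disabling the write cache
forces the drive to flush what it holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag for illustration; stands in for the buffer's B_ASYNC. */
#define SK_B_ASYNC 0x01   /* write may be cached and reordered by the drive */

/* Toy model of a drive; every field here is an assumption of the sketch. */
struct sk_drive {
    bool wcache_on;   /* current write-cache state of the simulated drive */
    int  cached;      /* writes sitting in the drive's volatile cache */
    int  on_media;    /* writes known to have reached the media */
    int  toggles;     /* cache enable/disable commands actually issued */
};

/* Change the cache state only if needed; disabling it flushes the cache. */
static void sk_set_wcache(struct sk_drive *d, bool on)
{
    if (d->wcache_on == on)
        return;                   /* already in the right state: no command */
    if (!on) {
        d->on_media += d->cached; /* assumed: cache-off forces a flush */
        d->cached = 0;
    }
    d->wcache_on = on;
    d->toggles++;
}

/* Issue one write, picking the cache state from the buffer flags. */
static void sk_write(struct sk_drive *d, int bflags)
{
    assert(d != NULL);
    sk_set_wcache(d, (bflags & SK_B_ASYNC) != 0);
    if (d->wcache_on)
        d->cached++;    /* drive may complete this later, in any order */
    else
        d->on_media++;  /* write-through: on media when the command completes */
}
```

In this model, a run of async writes followed by a synchronous one costs a
single cache-off command, and back-to-back synchronous writes issue no
further commands, which is why fewer synchronous writes (softdep) should
mean fewer cache state changes.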

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 23 years of experience will always make the difference
--