tech-kern archive


Re: Proposal: B_ARRIER (addresses wapbl performance?)

On Tue, Dec 09, 2008 at 02:47:50PM -0500, Thor Lancelot Simon wrote:
> [...]
> Note that the default tag reordering scheme isn't supposed to reorder
> even simple-tagged commands,

Are you sure about this? The way I read SAM-3, all simple tasks queued
between two ordered or head-of-queue tasks can complete in any order,
even if the write cache is disabled.
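To make that reading concrete, here is a minimal sketch of it as a toy model. It is not taken from SAM-3 or from any driver: the names `epochs` and `order_permitted` are hypothetical, head-of-queue tasks are left out, and an ordered task is simply treated as a barrier between the runs of simple tasks around it.

```python
SIMPLE, ORDERED = "simple", "ordered"

def epochs(queue):
    """Map each task name to an epoch number.

    Consecutive SIMPLE tasks share an epoch; every ORDERED task gets an
    epoch of its own, so it separates what came before from what follows.
    Task names are assumed to be unique.
    """
    result = {}
    epoch = 0
    prev_attr = None
    for name, attr in queue:
        # A new epoch starts when we reach an ORDERED task or leave one.
        if prev_attr is not None and (attr == ORDERED or prev_attr == ORDERED):
            epoch += 1
        result[name] = epoch
        prev_attr = attr
    return result

def order_permitted(queue, completion_order):
    """True if completion_order never crosses an ORDERED barrier."""
    ep = epochs(queue)
    if sorted(completion_order) != sorted(ep):
        return False  # not a permutation of the queued tasks
    seen = [ep[name] for name in completion_order]
    return seen == sorted(seen)

queue = [("w1", SIMPLE), ("w2", SIMPLE), ("barrier", ORDERED),
         ("w3", SIMPLE), ("w4", SIMPLE)]

# Simple tasks on the same side of the barrier may swap places.
print(order_permitted(queue, ["w2", "w1", "barrier", "w4", "w3"]))  # True
# A simple task may not move across the ordered task.
print(order_permitted(queue, ["w1", "w3", "barrier", "w2", "w4"]))  # False
```

Nothing in this model depends on the write cache setting, which is the point of the question above.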

> but if they are already sorted, the simple
> fact that the drive can complete many at once while new commands are still
> being submitted will still give most of the benefit of write caching
> without requiring WCE to be set, which is what causes problems requiring
> FUA in the first place.  Other tag reordering schemes let the drive sort
> simple-tagged requests with respect to the head position etc. while still
> treating ordered tags as barriers.
>
> SCSI disks generally *do not* ship with WCE turned on, because with sane
> host OSes there is little reason to do so.  Our I/O subsystem is only
> partially sane in this sense.
>
> And with WCE turned off, the drive isn't supposed to report commands as
> complete if the bits aren't on stable storage.
>
> The question really is, it seems to me, do we want "force this command
> to oxide now" _unconditionally_ or do we want "force this command to
> oxide before you let any previous commands hit oxide".  The latter seems
> much more elegant and flexible and also as if it is what consistency of
> the on-disk datastructures of the filesystem _should_ require, while the
> former seems stricter than what should be required, but as Bill pointed
> out, using only simple and ordered tags and the default tag reordering
> policy, the barrier this creates can end up actually impacting far more
> I/O than may be intended.

But I still can't see how using FUA avoids the need for ordered tags: you still
want to be sure the metadata has been written before the journal cleanup
is written (or that the journal entry is written before the metadata
starts being written). Or maybe Bill suggested using FUA for both the
journal and the metadata?
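The concern can be illustrated with a small sketch under stated assumptions: every write is issued with FUA, so completion implies the data is on stable media, but FUA by itself says nothing about ordering between outstanding commands. The host is also assumed not to wait for one write to complete before issuing the next. The helper names (`barrier_epochs`, `permitted_orders`, `may_land_before`) are hypothetical, and head-of-queue tasks are not modelled.

```python
from itertools import permutations

SIMPLE, ORDERED = "simple", "ordered"

def barrier_epochs(queue):
    """Consecutive SIMPLE tasks share an epoch; each ORDERED task has its own."""
    epochs, epoch, prev = {}, 0, None
    for name, attr in queue:
        if prev is not None and ORDERED in (attr, prev):
            epoch += 1
        epochs[name] = epoch
        prev = attr
    return epochs

def permitted_orders(queue):
    """Yield every completion order that respects the ORDERED barriers."""
    epochs = barrier_epochs(queue)
    for order in permutations(epochs):
        seen = [epochs[name] for name in order]
        if seen == sorted(seen):
            yield order

def may_land_before(queue, later, earlier):
    """True if some permitted order completes `later` ahead of `earlier`.

    With FUA on every write, completion order is also media order.
    """
    return any(order.index(later) < order.index(earlier)
               for order in permitted_orders(queue))

# FUA on both writes, both simple-tagged: the metadata may still reach
# the media before the journal entry.
fua_only = [("journal", SIMPLE), ("metadata", SIMPLE)]
print(may_land_before(fua_only, "metadata", "journal"))      # True

# Same writes, but the metadata write carries an ordered tag.
with_barrier = [("journal", SIMPLE), ("metadata", ORDERED)]
print(may_land_before(with_barrier, "metadata", "journal"))  # False
```

In this model FUA settles when a completed write is durable, while the ordered tag settles which write goes first; the two address different problems.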

Manuel Bouyer <>
     NetBSD: 26 ans d'experience feront toujours la difference
