tech-kern archive


Re: Exposing FUA as alternative to DIOCCACHESYNC for WAPBL



> The problem is that it does not always use SIMPLE and ORDERED tags in a
> way that would facilitate the use of ORDERED tags to enforce barriers.

Our scsipi layer actually never issues ORDERED tags right now as far
as I can see, and there is currently no interface to get it set for an
I/O.

> Also, that we may not know enough about the behavior of our filesystems
> in the real world to be 100% sure it's safe to set the other mode page
> bits that allow the drive to arbitrarily reorder SIMPLE commands (which
> under some conditions is necessary to match the performance of running
> with WCE set).

I was under the assumption that SIMPLE tagged commands can be, and
already are, reordered by the controller/drive at will, without
setting any other flags.

> When SCSI tagged queueing is used properly, it is not necessary to set WCE
> to get good write performance, and doing so is in fact harmful, since it
> allows the drive to return ORDERED commands as complete before any of the
> data for those or prior commands have actually been committed to stable
> storage.

This was what I meant when I said "even ordered tags couldn't avoid
the cache flushes". Using ORDERED tags doesn't provide on-media
integrity when WCE is set.

Now, it might be the case that on-media integrity is not the
primary goal. Then the flush is only a write barrier, not an integrity
measure. In that case yes, ORDERED does keep the semantics (e.g.
earlier journal writes are written before later journal writes). It
also makes things much easier to code: simply mark the I/O as ORDERED
and fire, with no need to explicitly wait for completion, and e.g.
journal locks can be dropped sooner.
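
To make the barrier-only reading concrete, here is a toy userspace
model, not kernel code. It assumes the device may shuffle SIMPLE
commands freely among themselves but never moves anything across an
ORDERED command. The function names and the journal block names
(j0, j1, j2, commit) are made up for the illustration.

```python
import random

SIMPLE, ORDERED = "SIMPLE", "ORDERED"

def device_execution_order(commands, rng):
    """Return one execution order the modelled device could choose.

    commands: list of (name, tag) in submission order.
    Model assumption: SIMPLE commands may be reordered among
    themselves, but nothing moves across an ORDERED command.
    """
    order, batch = [], []
    for name, tag in commands:
        if tag == ORDERED:
            rng.shuffle(batch)   # device reorders the SIMPLE batch at will
            order.extend(batch)
            batch = []
            order.append(name)   # ORDERED runs after all earlier commands
        else:
            batch.append(name)
    rng.shuffle(batch)           # trailing SIMPLE commands, also reorderable
    order.extend(batch)
    return order

def journal_commit_is_safe(order):
    """True if the commit record executes after every journal block."""
    commit = order.index("commit")
    return all(order.index(b) < commit for b in ("j0", "j1", "j2"))

# Fire everything without waiting; only the commit record is ORDERED.
submitted = [("j0", SIMPLE), ("j1", SIMPLE), ("j2", SIMPLE),
             ("commit", ORDERED), ("data0", SIMPLE), ("data1", SIMPLE)]
```

In this model the commit record always lands after j0..j2 whatever
the device does with the SIMPLE commands, while tagging the commit
SIMPLE as well lets it overtake them. It says nothing about what has
reached the media, only about ordering.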

I do think it's important to concentrate on the case where WCE is on,
since that is realistically what the majority of systems run with.

Just for the record, I can see these practical problems with ORDERED:
1. only available on SCSI, so fallback barrier logic is still needed
for less awesome hw
2. Windows and Linux used to always use SIMPLE tags and wait for
completion; this suggests the avenue may already have been explored
and found not interesting enough, or too buggy (remember scheduler
activations?)
3. bufq processing needs special care for MPSAFE SCSI drivers, to
prevent processing any further commands while I/O with ORDERED tag is
being submitted to the controller.
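
For point 3, one blunt way to get that guarantee is to hold the queue
lock across both the dequeue and the hand-off to the controller, so
no other thread can slip a later command in ahead. The sketch below
is a userspace model under that assumption; `Dispatcher`, `bufq` as a
plain deque and `controller_log` are stand-ins, not the real bufq or
scsipi interfaces. It serializes every submission, which is more
conservative than stalling only around ORDERED I/O.

```python
import threading
from collections import deque

class Dispatcher:
    """Toy model: several threads drain one queue towards a controller."""

    def __init__(self, commands):
        self.bufq = deque(commands)   # stand-in for the driver's queue
        self.lock = threading.Lock()
        self.controller_log = []      # order in which the controller saw I/O

    def run_one(self):
        # Dequeue and submit under the same lock, so the submission
        # order always matches the queue order.
        with self.lock:
            if not self.bufq:
                return False
            cmd = self.bufq.popleft()
            self.controller_log.append(cmd)   # "submit to controller"
        return True

    def worker(self):
        while self.run_one():
            pass
```

If the lock were dropped between the dequeue and the submit, two
threads could hand their commands to the controller in the opposite
order, which is exactly what must not happen around an ORDERED tag.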

I still see my FUA effort as a more direct replacement for the cache
flushes, since it keeps both the logical and on-media integrity. Also,
it will benefit SATA disks too, once/if NCQ is integrated.
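
A toy model of the difference, again userspace only and with made-up
names: with WCE on, a plain write completes once it sits in the
volatile cache, a FUA write completes only once that write is on the
media, and `cache_sync()` stands in for the full flush that
DIOCCACHESYNC requests.

```python
class Disk:
    """Toy disk with a volatile write cache (hypothetical model)."""

    def __init__(self, wce=True):
        self.wce = wce
        self.cache = {}   # volatile: lost on power failure
        self.media = {}   # stable storage

    def write(self, lba, data, fua=False):
        # With WCE on, a plain write completes once it is cached.
        # FUA forces this one write to media before completion.
        if self.wce and not fua:
            self.cache[lba] = data
        else:
            self.cache.pop(lba, None)   # drop any stale cached copy
            self.media[lba] = data

    def cache_sync(self):
        # Full cache flush: everything cached goes to media.
        self.media.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        # Whatever was only in the cache is gone.
        self.cache.clear()

disk = Disk(wce=True)
disk.write(0, "journal block")             # sits in the volatile cache
disk.write(1, "commit record", fua=True)   # on media at completion
disk.power_loss()
survivors = sorted(disk.media)             # only the FUA write survived
```

Note that in this model FUA covers only the write it is set on, so
every write that has to be stable needs FUA (or a flush before it);
earlier cached writes are not pushed out by a later FUA write.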

I think implementing barrier/ORDERED can be a parallel effort,
similar to the maxphys branch. I don't think barriers will make FUA
irrelevant, as it's still needed for systems with WCE on.

Jaromir

