Re: Proposal: B_ARRIER (addresses wapbl performance?)
On Tue, Dec 09, 2008 at 09:39:01PM +0100, Joerg Sonnenberger wrote:
> On Tue, Dec 09, 2008 at 12:12:32PM -0800, Jason Thorpe wrote:
> > You want your journal to remain self-consistent, otherwise you can't
> > trust it to replay it.
> If each journal entry is checksummed and it is ensured that old data
> is not valid, you can write the journal async as long as the journal
> hits the disk before the corresponding meta data. The minimal
> constraints are:
> (1) Newly allocated blocks are either in the journal or are written
> before the corresponding journal entry. (*)
> (2) Blocks in the journal are written after the journal entry and all
> previous journal blocks have been written to stable storage.
> (*) is currently not done. Depending on how to ensure that old data is
> not valid anymore, zeroing of the journal could be used and have
> similar constraints.
> The question is how this can be mapped to the physical constraints. For
> ATA, it can be effectively only done using a flush after (1) and before
> (2), but the latter doesn't have to be done after each journal entry,
> just when flushing the journal content to disk.
I don't get what you mean by "flushing the journal content to disk".
More precisely, I don't understand why it doesn't have to be done each time
we're going to write blocks. Without it, you can end up with blocks being
written before the corresponding journal entry, can't you?
> How does this look for more intelligent devices?
Either with the write cache disabled and doing the barrier in software
(i.e. waiting for command completion before sending the next commands, up
to the next write barrier), or using an ordered queue tag for journal
writes (the issue with the latter being, as already pointed out, that
it enforces the barrier for all commands, including those unrelated
to this filesystem).
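
To illustrate the first option, here is a rough sketch of a software
barrier. It is only a toy model, not driver code: ToyDisk, BARRIER and
submit_with_software_barriers are names made up for this example. It
assumes the write cache is disabled, so that a completed command is on
stable storage, and it models the device reordering outstanding commands
by completing them in reverse order.

```python
BARRIER = object()  # hypothetical marker separating groups of writes


class ToyDisk:
    """Toy device: write cache off, so 'completed' means 'on stable storage'."""

    def __init__(self):
        self.in_flight = []   # commands sent to the device, not yet completed
        self.completed = []   # order in which writes reached stable storage

    def submit(self, label):
        self.in_flight.append(label)

    def wait_all(self):
        # The device may complete outstanding commands in any order;
        # reversing them here models an unfavourable reordering.
        self.completed.extend(reversed(self.in_flight))
        self.in_flight.clear()


def submit_with_software_barriers(disk, commands):
    """Send commands; at each BARRIER, wait for completion before sending more."""
    for cmd in commands:
        if cmd is BARRIER:
            disk.wait_all()
        else:
            disk.submit(cmd)
    disk.wait_all()
    return disk.completed


order = submit_with_software_barriers(
    ToyDisk(),
    ["new-block", BARRIER, "journal-entry", BARRIER, "meta-1", "meta-2"],
)
print(order)
```

Writes between two barriers may still complete in any order, but nothing
sent after a barrier can reach the disk before what was sent ahead of it.
Only the commands of this one stream are held back, which is the
difference from an ordered queue tag.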
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
NetBSD: 26 years of experience will always make the difference