On Thu, Oct 30, 2008 at 07:08:54PM -0500, David Young wrote:
> On Thu, Oct 30, 2008 at 01:28:21PM -0700, Bill Stouder-Studenmund wrote:
> > On Wed, Oct 29, 2008 at 05:51:09PM -0400, Thor Lancelot Simon wrote:
> >
> > The problem is that this won't help. Ordered tags will relate the
> > sequencing of commands relative to each other. The journal, however,
> > doesn't care about the relative ordering of operations, it wants to know
> > when the writes to the journal have hit stable storage.
>
> Bill,
>
> That is because the journal writes must hit stable storage before the
> corresponding inodes are updated, right?

Yes. It's a two-stage commit. An operation doesn't happen if it's not in
the journal. If it's in the journal, we're free to write the blocks as we
see fit. Once all the blocks are written, we are free to overwrite that
part of the journal.

> > The key problem is that, on SCSI disks with the write cache enabled, a
> > write command can complete by writing to the cache.
> >
> > The journal needs a way to turn the cache off for one operation. That's
> > what FUA is. That's what we need.
>
> Does FUA necessarily turn off the cache for that operation, or does it
> tell the SCSI disk to wait to indicate command completion until after
> the write is flushed from cache to stable storage? It seems that letting
> the disk flush a write on its own schedule is desirable, unless the host
> has a backlog of disk ops that wait for the FUA write to complete.

It means _F_orce _U_nit _A_ccess. Don't report completion until after the
write has hit stable storage (battery backed up ram is ok). For a read,
don't read from cache, but make the heads read the media.

Yes, the disk can handle FUA writes when its scheduler sees fit.

Take care,

Bill
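
To make the two-stage commit above concrete, here is a rough sketch in Python. It is a toy model only, not NetBSD code: the Disk and Journal classes, their method names, and the block names are all made up for illustration. It assumes a disk whose ordinary writes may complete from a volatile cache, while a write with fua=True is on stable storage by the time it returns.

```python
class Disk:
    """Toy disk with a volatile write cache (hypothetical model, not a real API)."""

    def __init__(self):
        self.media = {}   # stable storage: survives a crash
        self.cache = {}   # volatile write cache: lost on a crash

    def write(self, addr, data, fua=False):
        if fua:
            # FUA: completion is reported only once the data is stable.
            self.cache.pop(addr, None)
            self.media[addr] = data
        else:
            # Without FUA the command may complete from the cache alone.
            self.cache[addr] = data

    def flush(self):
        # Push everything still in the cache out to stable storage.
        self.media.update(self.cache)
        self.cache.clear()

    def crash(self):
        # Power loss: whatever was only in the cache is gone.
        self.cache.clear()


class Journal:
    """Toy two-stage commit on top of Disk (names are illustrative)."""

    def __init__(self, disk):
        self.disk = disk
        self.next_slot = 0

    def commit(self, updates):
        # Stage 1: the operation exists only once its journal record is stable.
        slot = ("journal", self.next_slot)
        self.next_slot += 1
        self.disk.write(slot, dict(updates), fua=True)
        return slot

    def apply(self, updates):
        # Stage 2: the blocks themselves can be written as the disk sees fit.
        for addr, data in updates.items():
            self.disk.write(addr, data)

    def reclaim(self, slot):
        # Only after the blocks are stable may the journal space be reused.
        self.disk.flush()
        self.disk.write(slot, None, fua=True)

    def replay(self):
        # After a crash, redo every operation still recorded in the journal.
        for addr, record in list(self.disk.media.items()):
            if isinstance(addr, tuple) and addr[0] == "journal" and record:
                for block, data in record.items():
                    self.disk.write(block, data, fua=True)


disk = Disk()
journal = Journal(disk)
updates = {"inode 7": "size=4096", "block 120": "data"}
slot = journal.commit(updates)   # journal record written with FUA
journal.apply(updates)           # block writes sit in the cache
disk.crash()                     # cached block writes are lost
journal.replay()                 # the journal record survived, so redo them
journal.reclaim(slot)            # blocks are stable, slot can be overwritten
```

In this model, if commit() used a plain cached write instead, the crash would lose the journal record along with the blocks and replay() would have nothing to redo, which is the failure described for a write cache without FUA.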