tech-kern archive


Re: WAPBL/cache flush and mfi(4)

On Fri, Aug 24, 2012 at 05:28:15PM -0500, David Young wrote:
> > So you still need to flush the controller's cache before the disks' caches,
> > otherwise data can migrate from safe storage to unsafe storage.
> Will a controller really empty its cache into the attached disks'
> caches, or will it issue the disk writes, wait for the disks to
> acknowledge that the data is on the platter, and then empty the cache?

I guess that when the "disk write cache" parameter is set to enabled in the
firmware, the firmware will write to the disks' caches, not to the platters.
Otherwise this setting wouldn't make sense.
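
A minimal sketch of the ordering argument, as a toy user-space model rather
than driver code. It assumes, as guessed above, that the controller destages
into the disks' volatile caches; the class and method names are hypothetical.

```python
class StorageStack:
    """Toy model: battery-backed controller cache above a volatile disk cache."""

    def __init__(self, dirty_blocks):
        self.controller = set(dirty_blocks)  # safe: survives power loss
        self.disk_cache = set()              # unsafe: lost on power loss
        self.platter = set()                 # safe: on stable media

    def flush_controller(self):
        # Assumption: with "disk write cache" enabled, the controller
        # empties its cache into the disks' caches, not onto the platters.
        self.disk_cache |= self.controller
        self.controller.clear()

    def flush_disks(self):
        # The disks commit whatever is in their caches to the platters.
        self.platter |= self.disk_cache
        self.disk_cache.clear()

    def at_risk(self):
        # Blocks that would be lost if power failed right now.
        return set(self.disk_cache)


def flush_in_order(stack, controller_first):
    """Issue both flushes in the chosen order and report blocks left at risk."""
    if controller_first:
        stack.flush_controller()
        stack.flush_disks()
    else:
        stack.flush_disks()
        stack.flush_controller()
    return stack.at_risk()
```

Flushing the controller first leaves nothing at risk. Flushing the disks first
moves every dirty block from the battery-backed cache into the volatile one.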

> I have the following vague idea in mind for how an operating system
> should treat disk writes: it seems to me that our disk subsystem(s)
> should treat streams of disk writes kind of like TCP sessions in
> that the "receiver", which is either an instance of some disk driver
> (e.g., sd(4)) or a non-volatile cache, tells the "sender" (some user
> process that write(2)s, a filesystem, or the pager) that it is open to
> receive up to X megabytes.  The sender sends the receiver X-megabytes'
> worth of bufs, but holds onto a copy of the bufs itself until each is
> acknowledged.  Ordinarily an acknowledgement will come back saying "you
> may go ahead and send me Y more kilobytes, sender".  A sender may also
> get a NACK ("sorry, the backup disk was unplugged before it acknowledged
> that buffers P, Q, and R hit the media"); then it has to indicate the
> exception or else retransmit the buffers.
> Here and there in the system you will have software (a filesystem) or
> hardware (a battery-backed cache) that "proxies" disk-write streams.  A
> filesystem will "proxy" because it's probably going to either serialize
> writes (say to write them to a journal) or to augment them (say to
> update corresponding metadata).  Typically a filesystem will proxy, too,
> because we don't expect a user process to block in write(2) until
> all the bytes written have landed on the platter.  A battery-backed
> cache will proxy because it's going to guarantee disk-write completion
> to the sender.
> I have the following doubt about a battery-backed cache: what if I
> yank the disk?  I have never met a controller with battery-backed
> cache where I could not pull some of the disks right out of the front
> of the chassis.  I guess that usually those disks were redundant,
> too.  So, what if I yank two disks? :-) It seems like receivers and
> proxy receivers ought to advertise the guarantees that they do and do
> not make (e.g., "I guarantee that barring disk-yankage, I will put
> your bytes on the platter" OR "barring power failure or disk-yankage
> and non-replacement, I will put your bytes on the platter"), and
> senders' requirements ought to be matched to receivers' guarantees when a
> disk-write session is established.
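
To make the quoted proposal concrete, here is a rough user-space sketch of it,
not anything that exists in NetBSD. The Receiver and Sender classes, the
"survives" set of failure kinds and the nack_ids test hook are all
hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Hypothetical receiver: a disk driver or a non-volatile cache."""
    window: int                                  # bytes it will accept now
    survives: frozenset = frozenset()            # failures it guarantees against
    nack_ids: set = field(default_factory=set)   # test hook: buffers that fail
    pending: list = field(default_factory=list)  # (buf_id, size) in flight

    def accepts(self, required):
        # Session setup: every failure the sender cares about must be covered.
        return set(required) <= self.survives

    def submit(self, buf_id, size):
        if size > self.window:
            return False                         # no credit left, sender waits
        self.window -= size
        self.pending.append((buf_id, size))
        return True

    def complete(self):
        # ACK or NACK each in-flight buffer and reopen the window.
        done = [(buf_id, buf_id not in self.nack_ids)
                for buf_id, _ in self.pending]
        self.window += sum(size for _, size in self.pending)
        self.pending = []
        return done


class Sender:
    """Hypothetical sender: a filesystem, the pager, or a writing process."""

    def __init__(self, receiver, required=()):
        if not receiver.accepts(required):
            raise ValueError("receiver cannot meet the sender's requirements")
        self.receiver = receiver
        self.unacked = {}   # buf_id -> data, kept until acknowledged
        self.failed = {}    # buf_id -> data that was NACKed

    def write(self, buf_id, data):
        if not self.receiver.submit(buf_id, len(data)):
            return False
        self.unacked[buf_id] = data
        return True

    def poll(self):
        for buf_id, ok in self.receiver.complete():
            data = self.unacked.pop(buf_id)
            if not ok:
                self.failed[buf_id] = data   # retransmit or report an error
```

A sender that requires protection against a failure kind the receiver does not
list is refused when the session is set up. NACKed buffers stay available in
"failed" so the sender can retransmit them or raise the exception.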

I've never seen these properties explicitly explained in documentation.
I'm not sure a controller like mfi(4) has a way to report a write
error to the driver when doing a write back (maybe it will report a
SCSI deferred error sense, but I'm not sure).
Note that you don't have a way to know when data have been moved from the
write-back cache to disks.

Manuel Bouyer <>
     NetBSD: 26 ans d'experience feront toujours la difference
