Subject: Re: Extension of fsync_range() to permit forcing disk cache
To: Manuel Bouyer <firstname.lastname@example.org>
From: Bill Sommerfeld <email@example.com>
Date: 12/20/2004 11:33:43
On Fri, 2004-12-17 at 08:34, Manuel Bouyer wrote:
> In another thread, we admit that the upper layers need to be aware of this
> property of the ATA drives, and deal with it. fsync() doesn't have to
> call the flush cache ioctl directly; it could insert itself in the
> write queue with a write barrier. This way several subsystems'
> barriers could be combined into one to help performance on busy systems.
While a queued barrier can help with single-system self-consistency, you
may still wind up turning back the clock after a crash unless you also
put other externally visible signs from the processes into limbo until
the write completes.
Consider an SMTP server -- before responding with a 2xx code to the DATA command,
it should commit the message to stable storage; otherwise the sender may discard
its copy of a message that a crash then loses.
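A minimal sketch of that ordering in C, assuming a hypothetical spool file
per message; commit_message() and data_reply() are names made up for
illustration, not taken from any real MTA. The reply string is only chosen
after fsync() has returned successfully.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write the message to `path` and ask the kernel
 * to push it to stable storage. Returns 0 on success, -1 on failure.
 */
int
commit_message(const char *path, const char *msg, size_t len)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

	if (fd == -1)
		return -1;

	while (len > 0) {
		ssize_t n = write(fd, msg, len);

		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			close(fd);
			return -1;
		}
		msg += n;
		len -= (size_t)n;
	}

	/* Do not report success until the sync itself has succeeded. */
	if (fsync(fd) == -1) {
		close(fd);
		return -1;
	}
	return close(fd);
}

/*
 * Hypothetical helper: pick the reply to send once the message body
 * has been received. 2xx only if the commit succeeded.
 */
const char *
data_reply(const char *spool_path, const char *msg, size_t len)
{
	if (commit_message(spool_path, msg, len) == 0)
		return "250 OK\r\n";
	return "451 Requested action aborted: local error in processing\r\n";
}
```

Whether fsync() returning means the data has left the drive's write cache
is exactly the open question in this thread, so the sketch only shows the
ordering, not a durability guarantee. On some systems a newly created file
also needs its directory synced before the name is safe.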
All of this is, however, something of a probability game; there are no absolute
guarantees that the data is coming back, because any block written could go bad,
or the controller could fail in a way that it would acknowledge writes as
complete even when they weren't (I've seen it happen; story for another time).
What makes sense for a low-end ATA drive (it may lose dirty cache data on
loss of power or reset) may not make sense for a high-end RAID system with
battery-backed mirrored cache.
Which fsync semantics do you want:
- data recoverable even if something outside the storage widget fails
- data recoverable unless something inside the storage widget fails.