Subject: Re: Extension of fsync_range() to permit forcing disk cache flushing
To: Bill Studenmund <email@example.com>
From: Manuel Bouyer <firstname.lastname@example.org>
Date: 12/17/2004 20:52:17
On Fri, Dec 17, 2004 at 11:15:30AM -0800, Bill Studenmund wrote:
> > In another thread, we admitted that the upper layers need to be aware of
> > this property of ATA drives and deal with it. fsync() doesn't have to
> > call the flush cache ioctl directly; it could insert itself in the
> > write queue with a write barrier. This way several subsystems'
> > barriers could be combined into one to help performance on busy systems.
> I don't see how a write barrier will ensure that the data have left the
> cache. All it will do is ensure that the drive has reported everything
> before it as finished.
I meant a write barrier that ensures that the data are on stable storage,
as filesystems also want this.
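
As a rough illustration of the barrier idea, here is a minimal sketch in C.
Everything in it (struct wqueue, queue_write(), queue_barrier(), drain()) is
hypothetical and does not correspond to any existing NetBSD interface. It
only models how barriers queued back to back by several subsystems could
share a single cache flush.

```c
#include <stddef.h>

/* Hypothetical request types; not a real kernel interface. */
enum req_type { REQ_WRITE, REQ_BARRIER };

#define QUEUE_MAX 64

struct wqueue {
	enum req_type reqs[QUEUE_MAX];
	size_t len;
	unsigned flushes;	/* cache flushes issued to the (simulated) drive */
};

/* Queue an ordinary write. Returns 0 on success, -1 if the queue is full. */
static int
queue_write(struct wqueue *q)
{
	if (q->len == QUEUE_MAX)
		return -1;
	q->reqs[q->len++] = REQ_WRITE;
	return 0;
}

/*
 * Queue a barrier. If the tail of the queue is already a barrier, no
 * write separates the two, so the caller can share the existing one.
 */
static int
queue_barrier(struct wqueue *q)
{
	if (q->len > 0 && q->reqs[q->len - 1] == REQ_BARRIER)
		return 0;
	if (q->len == QUEUE_MAX)
		return -1;
	q->reqs[q->len++] = REQ_BARRIER;
	return 0;
}

/* Drain the queue; each barrier costs exactly one simulated cache flush. */
static void
drain(struct wqueue *q)
{
	for (size_t i = 0; i < q->len; i++) {
		if (q->reqs[i] == REQ_BARRIER)
			q->flushes++;	/* stand-in for a flush-cache command */
	}
	q->len = 0;
}
```

With two writes followed by barriers from three subsystems, then one more
write and one more barrier, drain() counts two flushes rather than four.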
> This isn't just a property of ATA drives. SCSI drives also have write
> caches, and they can be very large.
> > > Some applications "defensively" call fsync on every write;
> > > think what that will do to overall system performance.
> > Then maybe these applications need to be fixed?
> > Adding a flag to fsync() to make it do what we expect from it, just because
> > some applications don't use it properly, seems backward to me.
> But how do we know it won't do what we expect? Some disks have the write
> cache battery-backed, so it is as good as disk. Some disks will have the
> cache off. Some disks will have the cache on, but be in a battery-backed
> enclosure, so that the cache is safe even though it looks unsafe.
> Knowing what's "safe" is a local policy decision. I think we should leave
> it that way.
Sure, and then the administrator should be able to tune the behavior of
fsync(), filesystems, etc. via configuration, depending on the underlying
hardware and the level of security he wants. A flag to the fsync system call
itself is not appropriate for this.
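
A minimal sketch in C of what such a configuration knob could look like,
purely as an illustration: fsync_policy, flush_drive_cache() and
policy_fsync() are hypothetical names, and the flush is only a counter
standing in for a real flush-cache request. The point is that the caller's
interface stays a plain fsync() and the policy lives outside the
application.

```c
#include <unistd.h>

/* Hypothetical policy values an administrator could select. */
enum sync_policy {
	SYNC_TRUST_CACHE,	/* cache is battery-backed or disabled */
	SYNC_FLUSH_CACHE	/* volatile cache: flush it after fsync() */
};

/* Hypothetical tunable; in a real system this would come from configuration. */
static enum sync_policy fsync_policy = SYNC_TRUST_CACHE;

/* Counts simulated flushes so the behavior can be observed. */
static unsigned cache_flushes;

/* Stand-in for a real flush-cache request to the underlying drive. */
static int
flush_drive_cache(int fd)
{
	(void)fd;
	cache_flushes++;
	return 0;
}

/* Same interface as fsync(); the extra work depends only on the policy. */
static int
policy_fsync(int fd)
{
	if (fsync(fd) == -1)
		return -1;
	if (fsync_policy == SYNC_FLUSH_CACHE)
		return flush_drive_cache(fd);
	return 0;
}
```

An application calls policy_fsync(fd) exactly as it would call fsync(fd);
whether the drive cache is flushed is decided by the setting, not by a flag
passed on each call.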
Manuel Bouyer <email@example.com>
NetBSD: 26 years of experience will always make the difference