Subject: Re: Extension of fsync_range() to permit forcing disk cache flushing
To: Manuel Bouyer <bouyer@antioche.lip6.fr>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 12/17/2004 11:15:30

On Fri, Dec 17, 2004 at 02:34:08PM +0100, Manuel Bouyer wrote:
> On Fri, Dec 17, 2004 at 07:24:46AM -0500, Thor Lancelot Simon wrote:
> > On Fri, Dec 17, 2004 at 10:08:48AM +0100, Manuel Bouyer wrote:
> >
> > The problem is that syncing 1K of data from one file could cause an
> > entire 8MB cache to be written back (in fact, on an IDE disk, *will*
> > cause that).
>
> In another thread, we admitted that the upper layers need to be aware of
> this property of the ATA drives, and deal with it. fsync() doesn't have to
> call the flush cache ioctl directly; it could insert itself in the
> write queue with a write barrier. This way several subsystems'
> barriers could be combined into one to help performance on busy systems.

I don't see how a write barrier will ensure that the data have left the
cache. All it will do is ensure that the drive has reported everything
before it as finished.

This isn't just a property of ATA drives. SCSI drives also have write
caches, and they can be very large.

> > Some applications "defensively" call fsync on every write;
> > think what that will do to overall system performance.
>
> Then maybe these applications need to be fixed?
> Adding a flag to fsync() to make it do what we expect from it, just because
> some applications don't use it in a good way, seems backward to me.

But how do we know it won't do what we expect? Some disks have the write
cache battery-backed, so it is as good as disk. Some disks will have the
cache off. Some disks will have the cache on, but be in a battery-backed
enclosure, so that the cache is safe even though it looks unsafe.

Knowing what's "safe" is a local policy decision. I think we should leave
it that way.
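
To make the policy point concrete, here is a minimal sketch in C. The
policy enum, the flag names (SYNC_FILE, SYNC_DISK) and both functions are
hypothetical, made up for illustration; they are not existing kernel or
libc interfaces. The idea is only that the administrator states whether the
drive's cache is trusted, and the sync path asks for a cache flush solely
when it is not.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical local policy: does the write cache survive power loss? */
enum cache_policy {
    CACHE_TRUSTED,   /* cache off, battery-backed cache, or backed enclosure */
    CACHE_UNTRUSTED  /* volatile cache: data in it can be lost */
};

/* Hypothetical flag names, for illustration only. */
#define SYNC_FILE 0x1 /* push file data and metadata to the drive */
#define SYNC_DISK 0x2 /* additionally ask the drive to flush its cache */

/* Map the local policy to the flags a sync request should carry. */
int sync_flags_for(enum cache_policy policy)
{
    int flags = SYNC_FILE;

    if (policy == CACHE_UNTRUSTED)
        flags |= SYNC_DISK;
    return flags;
}

/*
 * Sync fd according to flags. Plain fsync() cannot express SYNC_DISK, so
 * this sketch only reports whether a cache flush would have been asked
 * for; the flush itself is left out (assumption: a real implementation
 * would issue it through whatever interface the system provides).
 */
int policy_sync(int fd, int flags, int *disk_flush_requested)
{
    assert(fd >= 0);
    *disk_flush_requested = (flags & SYNC_DISK) != 0;
    return fsync(fd);
}
```

With this shape, an application that calls fsync on every write pays for a
full cache flush only on systems whose administrator has marked the cache
as untrusted.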

Take care,

Bill
