Subject: Re: Smoother writing for LFS
To: Thor Lancelot Simon <tls@rek.tjls.com>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 10/23/2006 19:17:19

On Mon, Oct 23, 2006 at 07:06:15PM -0400, Thor Lancelot Simon wrote:
> I've been thinking a bit about smoothing out the write bursts caused by
> the interaction between LFS and our current "smooth" syncer.  I think I
> might have a fairly simple solution.
>
> 1) Rather than a global estimate, maintain an estimate per-filesystem
>    of the current number of dirty pages.  I'm not sure how hard this
>    would be, and would appreciate feedback.
>
> 2) Maintain, per filesystem, minimum and maximum "target write sizes".
>
> 3) Once per second, traverse the list of filesystems, and for any
>    filesystem with more than the minimum outstanding, clean until there's
>    nothing left or we hit the maximum.
>
> The sizes in #2 would also be useful for teaching NFS server write
> gathering that LFS prefers to write a minimum of one segment at a time.
>
> It is easy for LFS to track the current write bandwidth of the disk, so
> we could set the maximum size so that the disk is never more than X% busy
> with background writes in any given second.
>
> The only problem is this: we don't have any way to track (or clean) only
> the set of pages whose backing store is on a particular filesystem.  And
> I don't know what a good interface for that might look like, or what it
> would cost -- again, this is an area where I'd appreciate suggestions.
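
For concreteness, here's roughly the per-mount state I'd picture for #1
and #2. All of the names below are made up; they mostly show how little
bookkeeping is involved. The minimum would naturally be one LFS segment,
which doubles as the NFS write-gathering hint you mention:

	/*
	 * Hypothetical per-mount syncer state; none of these names
	 * exist in the tree today.
	 */
	struct mount_syncinfo {
		size_t	msi_dirty_est;	/* #1: estimated dirty pages */
		size_t	msi_write_min;	/* #2: floor, e.g. one LFS segment */
		size_t	msi_write_max;	/* #2: per-second ceiling */
		size_t	msi_bandwidth;	/* measured write rate, pages/sec */
	};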

I think it would make sense to track dirty pages per file system. Hang the
lists off of the mount point rather than having a global list. Since we
know the owning vnode, we know the owning mount.
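
Roughly, the page-dirtying path would then do something like the sketch
below. The list head on struct mount, the vnode flag, the entry field,
and mnt_syncinfo are all things we'd have to add (none of them exists
today), and I've left out locking entirely:

#include <sys/param.h>
#include <sys/queue.h>
#include <sys/mount.h>
#include <sys/vnode.h>

/*
 * Called whenever a page backed by vp is dirtied.  mnt_dirtyvnodes,
 * mnt_syncinfo, VONDIRTYLIST, and v_dirtylist are all invented.
 */
void
mount_note_dirty(struct vnode *vp)
{
	struct mount *mp = vp->v_mount;	/* owning vnode -> owning mount */

	if ((vp->v_flag & VONDIRTYLIST) == 0) {
		vp->v_flag |= VONDIRTYLIST;
		TAILQ_INSERT_TAIL(&mp->mnt_dirtyvnodes, vp, v_dirtylist);
	}
	mp->mnt_syncinfo.msi_dirty_est++;	/* one more dirty page */
}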

I like the idea you describe of having a certain amount of I/O outstanding
to disks all the time, and per-mount queues are the easiest way I see of
knowing which blocks need to go to which devices.
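
The once-a-second pass would then just walk the mounts and drain anything
between the two targets, with the ceiling derived from the bandwidth LFS
measures. Again only a sketch: fs_clean_pages() and the msi_* fields are
invented, and I'm writing the mountlist iteration from memory:

/* Background writes may use at most this share of the disk. */
#define	SYNC_BUSY_PCT	50

/*
 * Once-a-second pass over all mounted filesystems (#3).
 */
void
syncer_tick(void)
{
	struct mount *mp;

	CIRCLEQ_FOREACH(mp, &mountlist, mnt_list) {
		struct mount_syncinfo *msi = &mp->mnt_syncinfo;
		size_t target;

		/* Per-second ceiling from the measured write rate. */
		msi->msi_write_max =
		    (msi->msi_bandwidth * SYNC_BUSY_PCT) / 100;

		if (msi->msi_dirty_est < msi->msi_write_min)
			continue;	/* below one segment's worth */

		target = MIN(msi->msi_dirty_est, msi->msi_write_max);
		fs_clean_pages(mp, target);	/* write up to 'target' */
	}
}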

> However, we could quite possibly implement this first for metadata
> buffers, where it might address some of the issues with our syncfs by
> reducing the amount of outstanding data it handles.

I'm curious what is so different between our syncfs and FreeBSD's, or
between ours and where FreeBSD's was back when we got softdeps. They don't
have these problems, AFAIK.

Take care,

Bill
