Subject: Re: RFC: VOP_BMAP() change proposal
To: None <tech-kern@netbsd.org>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 12/30/2005 10:45:43

On Fri, Dec 30, 2005 at 03:22:34AM +0100, Reinoud Zandijk wrote:
> Dear folks,
>
> VOP_BMAP is currently defined as:
>      int
>      VOP_BMAP(struct vnode *vp, daddr_t bn, struct vnode **vpp,
>               daddr_t *bnp, int *runp);
>
> My problem with this is that it doesn't signal whether it's a _read_ or a
> _write_ request. Some filingsystems with a preference for shuffling blocks
> like UDF and also LFS might want to rewrite blocks on a different place
> than it reads them from. So when f.e. genfs() calls BMAP on an extent it
> gets both a wrong mapping but most of all a wrong _runlength_. A fragmented
> file that is written out in a sequential part is thus split up in
> separate sectors just because it was fragmented on the disc... :-S this
> logic is baffling me.
>
> A simple solution would be to pass a flag to the BMAP indicating reading or
> writing so filingsystems that want to distinguish between the two can do
> so.
>
> Thoughts?

I think you're trying to solve the problem the wrong way. You should not
schedule writes in VOP_BMAP(), thus you don't need to know if you're
mapping for write or read.

Do more of what LFS does. It faces the exact same issues you do, and yet
it's fine with the existing VOP_BMAP(). :-)

Put another way, doing this in VOP_BMAP() is reactive. You should be
_proactively_ scheduling writes. If you do all of the scheduling in
VOP_BMAP(), you can only schedule the operation you are asked about. As
you note above, that makes it harder to consolidate. Also, you rely on genfs
scheduling the writes; if a buffer doesn't get requested for write, you
don't get to consolidate it with others.

Doing something higher up will let you have access to all the data for a
file, and let you manipulate the writes much better.
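
To make the reactive/proactive distinction concrete, here is a toy
userland sketch in C. It is not NetBSD code: struct toy_file, toy_bmap(),
toy_flush() and toy_init_fragmented() are names invented purely for
illustration, and the "run" here is assumed to count the contiguous blocks
following the one asked about. The lookup only reports the current
mapping; the flush routine sees every dirty block of the file at once and
relocates them into one contiguous region.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not kernel types. */
typedef int64_t blkno_t;            /* plays the role of daddr_t */
#define NBLOCKS  16                 /* logical blocks in the toy file */
#define UNMAPPED ((blkno_t)-1)

struct toy_file {
	blkno_t map[NBLOCKS];       /* logical block -> physical block */
	int     dirty[NBLOCKS];     /* nonzero if the block needs writing */
};

/* Lay the file out on every other physical block, so no runs exist. */
static void
toy_init_fragmented(struct toy_file *f)
{
	for (int lbn = 0; lbn < NBLOCKS; lbn++) {
		f->map[lbn] = 100 + 2 * lbn;
		f->dirty[lbn] = 0;
	}
}

/*
 * Lookup in the spirit of a block-map call: report where lbn lives now
 * and how many following logical blocks are physically contiguous.
 * It never changes the mapping, so it needs no read/write flag.
 */
static int
toy_bmap(const struct toy_file *f, blkno_t lbn, blkno_t *pbn, int *run)
{
	if (lbn < 0 || lbn >= NBLOCKS || f->map[lbn] == UNMAPPED)
		return -1;
	int n = 0;
	while (lbn + n + 1 < NBLOCKS &&
	    f->map[lbn + n + 1] == f->map[lbn] + n + 1)
		n++;
	*pbn = f->map[lbn];
	*run = n;
	return 0;
}

/*
 * Proactive write path: walk all dirty blocks of the file in logical
 * order and assign them consecutive physical blocks from next_free.
 * Returns the number of blocks relocated.
 */
static int
toy_flush(struct toy_file *f, blkno_t next_free)
{
	assert(next_free >= 0);
	int n = 0;
	for (int lbn = 0; lbn < NBLOCKS; lbn++) {
		if (!f->dirty[lbn])
			continue;
		f->map[lbn] = next_free + n;
		f->dirty[lbn] = 0;
		n++;
	}
	return n;
}
```

In this toy, marking every block dirty and calling toy_flush() once turns
a file with no contiguous runs into a single run, and toy_bmap() simply
reports the new layout afterwards. A real file system would of course
also have to handle allocation, locking and buffer management, none of
which is modelled here.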

Take care,

Bill
