Subject: Re: RFC (reassign)buf and carving up buffers (was Re: SCSI MMC device abstraction and UDF patch for review)
To: Reinoud Zandijk <reinoud@netbsd.org>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 12/29/2005 09:58:51

On Thu, Dec 29, 2005 at 03:17:58PM +0100, Reinoud Zandijk wrote:
> Dear Bill,
>
> On Wed, Dec 28, 2005 at 09:45:54PM -0800, Bill Studenmund wrote:
> > > I'll try out the reassign though i wonder what problems and complexity
> > > might arise from mismatched vop_strategy() buffer sizes and disc logical
> > > block sizes. vop_strategy() now requests buffers up to say 64 kb/piece and
> > > the logical block size might be 2kb. Each part of 2kb can/could be stored
> > > somewhere else on disc. Normally a VOP_BMAP could determine the extent it
> > > could take in one go but looking up such information can be costly.
> >
> > vop_strategy() makes requests sized as the file system wants. To be
> > perfectly honest, if your 2k logical blocks are not next to each other,
> > you shouldn't be issuing a 64k request. Or at least it should be broken up
> > before the vop_strategy() level.
>
> That implies having a VOP_BMAP() figuring this out. Since UDF can't use a
> VOP_BMAP this way (due to write shuffling) it would mean that VOP_BMAP
> needs to distinguish between read and write requests and for read-request
> try to figure out how much it can read in one go... quite expensive and
> locking trouble prone.

This does not imply VOP_BMAP() figuring this out.

The file system decides what data goes into which buffers, and it knows
which blocks are where. So you don't have to work all of this out in the
middle of your strategy routine; you can work it out when you create the
buffers in the first place.

More directly, you SHOULD figure it out before your strategy routine.
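
To make that concrete, here is a minimal sketch in plain userland C of
splitting a logical range into physically contiguous runs before any I/O is
issued. It is not NetBSD kernel code: struct extent, split_into_extents()
and the map[] array are hypothetical names made up for illustration, and
map[] simply stands in for whatever allocation information the file system
already holds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one contiguous run of blocks, i.e. what one buffer covers. */
struct extent {
	uint64_t lblk;   /* first logical block within the file */
	uint64_t pblk;   /* first physical block on disc */
	uint32_t count;  /* number of contiguous blocks in the run */
};

/*
 * Split the logical range [0, nblks) into physically contiguous runs.
 * map[i] is the physical block backing logical block i (an assumed
 * stand-in for the file system's own block knowledge).
 * Returns the number of extents stored in out, or SIZE_MAX if out
 * has room for fewer extents than the range needs.
 */
static size_t
split_into_extents(const uint64_t *map, size_t nblks,
    struct extent *out, size_t max_out)
{
	size_t n = 0;

	assert(map != NULL && out != NULL);
	for (size_t i = 0; i < nblks; i++) {
		/* Does this block directly follow the current run on disc? */
		if (n > 0 && map[i] == out[n - 1].pblk + out[n - 1].count) {
			out[n - 1].count++;
			continue;
		}
		if (n == max_out)
			return SIZE_MAX;	/* caller's array is too small */
		/* Start a new run, which would become a separate buffer. */
		out[n].lblk = i;
		out[n].pblk = map[i];
		out[n].count = 1;
		n++;
	}
	return n;
}
```

With 2k logical blocks, a 64k range is 32 entries in map[]. Each extent
that comes back would be handed to the strategy routine as its own buffer,
so no single request ever spans a discontinuity on disc.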

> > As such, I don't think we need this support to do what you suggest. I can
> > think of other reasons I would like it, but the file system should only be
> > issuing requests (to this level of the kernel) that can make it to the
> > disk in contiguous blocks.
>
> Remember, UDF (and LFS AFAIK) uses the vnodes of the filing system to
> buffer its data. A VOP_STRATEGY() read request of an extent is thus an
> extent read/write on the file/vnode and has nothing to do with disc mapping
> IMHO. Disc scheduling is done at the device node anyway; not on the
> file/vnode's node. At device level i agree with you though.

No, a VOP_STRATEGY() call does NOT represent a read/write that has nothing
to do with disk mapping. It represents a read or write of a buffer, and that
buffer represents an extent on disk. One extent. If you have multiple
extents in your transfer, you are dealing with multiple buffers.

Take care,

Bill
