Subject: Re: VOP_BMAP question
To: None <tech-kern@netbsd.org>
From: Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
List: tech-kern
Date: 12/24/2003 01:24:19
On Tue, Dec 23, 2003 at 04:12:38PM -0800, Bill Studenmund wrote:
> On Tue, Dec 23, 2003 at 07:09:53PM +0100, Juergen Hannken-Illjes wrote:
> > On Tue, Dec 23, 2003 at 09:55:37AM -0800, Bill Studenmund wrote:
> > > On Fri, Dec 19, 2003 at 10:13:22PM +0100, Juergen Hannken-Illjes wrote:
> > > > How does VOP_BMAP() handle fragments?
> > > > 
> > > > Given a file with holes obtained from ftruncate(), what does VOP_BMAP()
> > > > return in its argument "bnp" if it finds a fragment?
> 
> Holes are usually a problem. VOP_BMAP() isn't good for triggering 
> allocation, which is what you need to fill a hole.
> 
> > > > Is it the block number of the fragment or will it return (daddr_t)-1?
> 
> It'll be the fragment's address.
> 
> > > > Is it always ok to write a full block to "bnp"?
> 
> If you mean ffs block, I don't think so. I think you have to know that you 
> won't be writing past the end of the file.

This is the restriction. The last block returned by VOP_BMAP may be a fragment
if a full block would extend past the end of the file. If the block does not
extend past the end of the file, it is always a full block.
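
To make the rule concrete, here is a rough sketch in plain C of how many bytes
are actually backed at a given logical block. The function backed_bytes() and
its parameters are made up for illustration and are not kernel code. It also
assumes any block that straddles the end of the file is rounded up to whole
fragments, and it ignores any further restriction ffs may place on which
blocks can be fragments:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function: number of bytes backing
 * logical block lbn of a file of file_size bytes, given bsize-byte
 * blocks and fsize-byte fragments.
 */
static int64_t
backed_bytes(int64_t file_size, int64_t lbn, int64_t bsize, int64_t fsize)
{
	int64_t start = lbn * bsize;	/* byte offset of this block */
	int64_t tail;

	assert(bsize > 0 && fsize > 0 && bsize % fsize == 0);

	if (start >= file_size)
		return 0;		/* block lies entirely past EOF */
	if (start + bsize <= file_size)
		return bsize;		/* block ends inside the file */

	/* Last block: round the remaining bytes up to whole fragments. */
	tail = file_size - start;
	return (tail + fsize - 1) / fsize * fsize;
}
```

With 8k blocks and 1k fragments, a 20000 byte file has two full blocks and a
tail of 3616 bytes, so only 4k is backed at logical block 2 and writing a full
8k there would run past the allocation.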

> > > I think you've been bitten by an ffs ambiguity (since only ffs has 
> > > "fragments").
> > > 
> > > What ffs calls a fragment in its documentation (the 1k in an 8k/1k file 
> > > system) is what the kernel internally calls a block. Since VOP_BMAP() 
> > > deals with kernel things, a "fragment" is a block, so there is no problem.
> > 
> > So ufs_bmaparray() first sets "maxrun = MAXPHYS / mp->mnt_stat.f_iosize - 1"
> > which is "64k / 8k - 1 == 7" from example above. Then it computes "*runp" as
> > the number of 1k blocks (fragments) that are contiguous.

Here I was wrong: "*runp" is the number of (8k) blocks. From ufs_issequential():

	return (daddr0 + ump->um_seqinc == daddr1);

ump->um_seqinc is the number of fragments in a block, so two addresses count as
sequential only when they are exactly one full block apart. Now I am sure the
snippet from sys/dev/vnd.c is correct.
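
As a rough illustration of why the run length comes out in blocks, here is a
simplified sketch in plain C. It is not the loop from ufs_bmaparray();
count_run() and the address array are made up, seqinc stands in for
ump->um_seqinc, and the addresses are assumed to be in fragment units:

```c
#include <assert.h>
#include <stdint.h>

/* Same test as the line quoted above, with seqinc passed in explicitly. */
static int
is_sequential(int64_t daddr0, int64_t daddr1, int64_t seqinc)
{
	return (daddr0 + seqinc == daddr1);
}

/*
 * Hypothetical helper: addrs[] holds the disk addresses of n consecutive
 * logical blocks.  Return how many blocks after addrs[0] are contiguous
 * on disk, capped at maxrun.
 */
static int
count_run(const int64_t *addrs, int n, int64_t seqinc, int maxrun)
{
	int run = 0;

	assert(n >= 1 && seqinc > 0 && maxrun >= 0);

	while (run < maxrun && run + 1 < n &&
	    is_sequential(addrs[run], addrs[run + 1], seqinc))
		run++;
	return run;
}
```

With seqinc == 8 (8k/1k), the addresses 100, 108, 116, 200 give a run of 2:
two more full blocks follow the first one contiguously. Each step of the run
is one block, not one fragment, which matches "sz = (1 + nra) * bsize".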

> > From sys/dev/vnd.c:
> > 
> > 	bsize = vnd->sc_vp->v_mount->mnt_stat.f_iosize;
> > 	...
> > 	error = VOP_BMAP(vnd->sc_vp, bn / bsize, &vp, &nbn, &nra);
> > 	...
> > 	sz = (1 + nra) * bsize;
> > 
> > This looks like it would run on "blocks" instead of "fragments".
> 
> Having a run size of ffs blocks does not mean that the block number 
> returned is also in units of ffs blocks.
> ffs will read f_iosize blobs up until the end of the file, so it's 
> appropriate for sz to be in f_iosize blobs.
> 
> I've been quite confused by the code, so I'm not really sure if the bn / 
> bsize is right; it might really need to be bn / f_bsize (which is 
> "fragment" size).
> 
> Take care,
> 
> Bill



-- 
Juergen Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)