Subject: Re: filingsystem/UBC interface
To: None <tech-kern@netbsd.org>
From: Chuck Silvers <chuq@chuq.com>
List: tech-kern
Date: 09/02/2005 08:56:50
On Fri, Sep 02, 2005 at 04:40:58PM +0200, Reinoud Zandijk wrote:
> Dear folks,
> 
> scouting the filingsystems for details about filingsystem/UBC
> interaction puzzles me. There seem to be several methods used for
> VOP_READ/VOP_WRITE interaction with the UBC:
> 
> - no visible interface, like smbfs, ntfs
> - explicit UBC code snippet like the one used for ffs, nfs, cd9660,
>   filecore, msdosfs, adosfs:
> 
>       win = ubc_alloc(&vp->v_uobj, uio->uio_offset, &bytelen, UBC_READ);
>       error = uiomove(win, bytelen, uio);
>       flags = UBC_WANT_UNMAP(vp) ? UBC_UNMAP : 0;
>       ubc_release(win, flags);
> 
> Would that mean that smbfs and ntfs do not use the UBC? or does that simply 
> mean they don't use the mmap semantics of the vop_getpages, vop_strategy 
> path but rather use the older read/write buffers?

ntfs and smbfs have not been converted to use the UBC.


> from the man-page:
>      ubc_alloc() creates a kernel mapping of uobj starting at offset offset.
>      The desired length of the mapping is pointed to by lenp, but the actual
>      mapping may be smaller than this.  lenp is updated to contain the actual
>      length mapped.  The flags must be one of
>      ...
> 
>      ubc_alloc(struct uvm_object *uobj, voff_t offset, vsize_t *lenp, int
>      flags);
> 
> Can I use the `*lenp' parameter passed to ubc_alloc() to force a certain
> buffer length to be passed in calls to VOP_STRATEGY()? Or is it just
> truncated to the nearest page size?

that argument to ubc_alloc() has nothing to do with the sizes of the
I/O requests sent down to the underlying device driver.  it's used to let
the caller know the size of the returned virtual mapping of the pages.
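
to illustrate why the caller has to look at the updated length, here is a
small userland sketch (not kernel code).  mock_window_len() is a
hypothetical stand-in for the length handling in ubc_alloc(), and the
8 KB MOCK_WINSIZE is an assumption made only for this example.  it just
shows that one request can need several mappings, each possibly shorter
than what was asked for:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: mapping window size used by this mock only. */
#define MOCK_WINSIZE ((size_t)8192)

/*
 * Hypothetical stand-in for the length handling in ubc_alloc():
 * clip *lenp so the mapping does not cross a window boundary,
 * and return the clipped length.
 */
static size_t
mock_window_len(size_t offset, size_t *lenp)
{
	size_t room;

	assert(lenp != NULL);
	room = MOCK_WINSIZE - (offset % MOCK_WINSIZE);
	if (*lenp > room)
		*lenp = room;
	return *lenp;
}

/* Count how many mappings a transfer of 'resid' bytes at 'offset' needs. */
static unsigned
count_windows(size_t offset, size_t resid)
{
	unsigned n = 0;

	while (resid > 0) {
		size_t bytelen = resid;	/* ask for everything that is left */

		mock_window_len(offset, &bytelen);
		/* a real caller would uiomove() 'bytelen' bytes here */
		offset += bytelen;
		resid -= bytelen;
		n++;
	}
	return n;
}
```

with these assumptions, count_windows(4096, 20000) returns 3: the first
mapping covers 4096 bytes, the second 8192, and the third the remaining
7712.  none of those lengths says anything about the I/O sizes that
reach the driver.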


the way to control the size of the I/O requests that the genfs code will
generate is to set the "mnt_fs_bshift" field of your file system's
struct mount to the log2 of the block size of your file system.  all I/O
issued will be
multiples of that size.  file systems that have fragments (and thus shouldn't
always read or write whole blocks) can use a custom gop_size callback to
deal with that.
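
as a rough sketch of the arithmetic only: bsize_to_bshift() and
round_to_blocks() below are hypothetical helpers, not existing kernel
functions, and the rounding up is my illustration of "multiples of the
block size", not a copy of what genfs does internally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: log2 of a power-of-two block size. */
static int
bsize_to_bshift(size_t bsize)
{
	int shift = 0;

	/* block size must be a non-zero power of two */
	assert(bsize != 0 && (bsize & (bsize - 1)) == 0);
	while (((size_t)1 << shift) < bsize)
		shift++;
	return shift;
}

/* Hypothetical helper: round a byte count up to whole blocks of 1 << bshift. */
static size_t
round_to_blocks(size_t len, int bshift)
{
	size_t mask = ((size_t)1 << bshift) - 1;

	return (len + mask) & ~mask;
}
```

for a 2048-byte block size bsize_to_bshift() gives 11, so the mount code
would store 11 in mnt_fs_bshift (something like
"mp->mnt_fs_bshift = bsize_to_bshift(2048);", where "mp" is an assumed
name for the struct mount pointer).  a 5000-byte range then rounds to
6144 bytes, i.e. three whole blocks.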

-Chuck