Subject: Re: lseek() extension for spare files
To: None <>
From: Bill Studenmund <>
List: tech-kern
Date: 09/21/2006 14:12:19

On Thu, Sep 21, 2006 at 10:20:51PM +0200, Reinoud Zandijk wrote:
> Dear Bill,
> On Thu, Sep 21, 2006 at 11:19:14AM -0700, Bill Studenmund wrote:
> > > It adds the SEEK_DATA and SEEK_HOLE `whence' arguments to lseek(). For a
> > > more detailed look see the solaris 10 man page :
> > >
> > >
> > >
> > > Although its pretty complete and smoothed out it might need some fine
> > > tuning. Note that in this patch no file system has yet implemented sparse
> > > area reporting and the genfs implementation, that allmost all use,
> > > implements the basic functionality.
> >
> > How will we implement the guts of SEEK_DATA and SEEK_HOLE? If we've handed
> > the whole VOP off to genfs, we don't have an fs-specific callback.
> >
> > I think what we should do is have each fs have its own seek routine, which
> > calls a genfs routine with the vop info plus a callback to handle finding
> > regions. Either that, or to the extent we have genfs-ops, we need another
> > one.
> As discussed earlier i've moved the implementation of the `whence' to the
> file system vnode. To ease the implementation i've passed the relevant data

Ok, the text was a bit confusing.

> down stream and all file systems that don't support sparse files natively
> like msdosfs just use the genfs_seek function. It will provide a
> minimalistic but correct behaviour for normal seeks and for data/hole
> searches.

I'm not sure you have the SEEK_DATA case right:

+               if (ap->a_offset != 0)
+                       return ENXIO;

My read of the manpage is that if you're on top of data, there is a data
range starting at your offset. Note the use of "greater or equal" in the
Sun man page.

I think the check should instead be that as long as the offset is not at or
past the end of the file, there's data, so no error.
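
To make that concrete, here is a minimal userland sketch of the fallback
behaviour I have in mind for a file system that keeps no hole information.
It is not the genfs code from the patch: the function name, the enum and
the plain int64_t offsets are all made up for illustration, and treating
end of file as a virtual hole for SEEK_HOLE is my reading of the Sun man
page.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical whence values for this sketch only. */
enum sketch_whence { SKETCH_SEEK_DATA, SKETCH_SEEK_HOLE };

/*
 * Generic fallback for a file system with no hole information: the
 * whole file is one data range, followed by a virtual hole at EOF.
 * Returns 0 and stores the new offset in *newoff, or an errno value.
 */
static int
sketch_generic_seek(int64_t offset, int64_t file_size,
    enum sketch_whence whence, int64_t *newoff)
{
	if (offset < 0)
		return EINVAL;
	if (offset >= file_size)
		return ENXIO;	/* at or past EOF: no data, no hole */

	if (whence == SKETCH_SEEK_DATA)
		*newoff = offset;	/* we are already on top of data */
	else
		*newoff = file_size;	/* only hole is the one at EOF */
	return 0;
}
```

The point is that SEEK_DATA from a nonzero offset inside the file succeeds
and returns that same offset, rather than failing with ENXIO.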

> Each file system that does want to provide sparcity information can either
> implement it as a seek for zero's (not really encouraged but never the less

No. We should not implement this as a seek for zeros. Either you look at
your allocation tables to find holes, or you don't find holes.

> possible) or seek its datastructures and give the offsets asked for. One
> call can only return one offset so no lists have to be made.
> Note that the interface definition in the Solaris manpage does not state
> that data area's don't have zero regions nor that holes have a minimal
> size.
> I think generally speaking for most file systems holes are defined in terms
> of sector size blocks anyway and can be searched for in its own way.

Holes are unallocated areas of files, so they by definition are sized
according to the block allocation policy of the file system.
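
As a rough sketch of what "look at your allocation tables" could mean, the
following walks a file block by block and asks a per-fs callback whether
each logical block is allocated. This is the callback shape suggested
earlier in the thread, written as plain userland C. Every name here
(sketch_seek_region, the callback type, the bool-array block map) is
hypothetical, and a real file system would consult its own block pointers
or extents instead of an array.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-fs callback: is logical block lbn allocated? */
typedef bool (*sketch_blk_allocated_fn)(void *fsdata, int64_t lbn);

/* Toy "allocation table": one flag per logical block. */
struct sketch_blkmap {
	const bool *allocated;
	int64_t nblocks;
};

static bool
sketch_blkmap_allocated(void *fsdata, int64_t lbn)
{
	const struct sketch_blkmap *map = fsdata;

	return lbn >= 0 && lbn < map->nblocks && map->allocated[lbn];
}

/*
 * Find the next data region (want_data) or hole (!want_data) at or
 * after offset.  Results are block aligned unless offset itself is
 * already inside a matching block.  Never inspects file contents.
 */
static int
sketch_seek_region(int64_t offset, int64_t file_size, int64_t blksize,
    bool want_data, sketch_blk_allocated_fn allocated, void *fsdata,
    int64_t *newoff)
{
	int64_t lbn, nblocks, start;

	if (offset < 0 || blksize <= 0)
		return EINVAL;
	if (offset >= file_size)
		return ENXIO;

	nblocks = (file_size + blksize - 1) / blksize;
	for (lbn = offset / blksize; lbn < nblocks; lbn++) {
		if (allocated(fsdata, lbn) == want_data) {
			start = lbn * blksize;
			*newoff = start > offset ? start : offset;
			return 0;
		}
	}
	if (!want_data) {
		*newoff = file_size;	/* virtual hole at end of file */
		return 0;
	}
	return ENXIO;			/* no more data before EOF */
}
```

Note that a data block full of zeros is still reported as data here, and a
hole can never be smaller than one block, which is the behaviour I am
arguing for.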

> If i understand FFS enough, it could be that the file's allocation tree can
> be searched and the relevant region extracted at allmost no cost; esp. if
> you count the reading and movement of all that dummy data to userland.

It won't be "at almost no cost"; however, it will be MUCH cheaper than
reading the data. It will still be expensive, as indirect blocks will have
to be read into the kernel, and that usually means disk seeks.
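
For a back-of-the-envelope feel, here is a small sketch that counts how
many indirect blocks would have to be read to scan the whole block map of
a fully allocated file. It assumes the usual FFS layout of 12 direct
pointers in the inode followed by single and double indirect blocks, and
it ignores triple indirection and caching. The function name and the
pointers-per-block parameter are mine, purely for illustration.

```c
#include <stdint.h>

#define SKETCH_NDADDR 12	/* direct block pointers in the inode */

/*
 * Upper bound on indirect blocks read to walk the block map of a file
 * with nblocks logical blocks, where each indirect block holds
 * ptrs_per_blk pointers.  Triple indirect blocks are ignored.
 * Returns -1 on bad arguments.
 */
static int64_t
sketch_indirect_reads(int64_t nblocks, int64_t ptrs_per_blk)
{
	int64_t reads, rest;

	if (nblocks < 0 || ptrs_per_blk <= 0)
		return -1;
	if (nblocks <= SKETCH_NDADDR)
		return 0;		/* direct pointers only */

	reads = 1;			/* the single indirect block */
	rest = nblocks - SKETCH_NDADDR;
	if (rest <= ptrs_per_blk)
		return reads;

	rest -= ptrs_per_blk;
	if (rest > ptrs_per_blk * ptrs_per_blk)
		rest = ptrs_per_blk * ptrs_per_blk;	/* cap at double */

	/* Double indirect root plus one leaf per ptrs_per_blk blocks. */
	reads += 1 + (rest + ptrs_per_blk - 1) / ptrs_per_blk;
	return reads;
}
```

With 2048 pointers per indirect block, a 65536-block file needs a few tens
of indirect block reads under this model, versus 65536 data block reads if
you scan the contents. Cheap by comparison, but each of those reads can
still cost a disk seek.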

Take care,

