tech-kern archive


Re: DIRBLKSIZ differs between userland & kernel

On Mon, Feb 27, 2012 at 10:55:48AM +1030, Brett Lymn wrote:
 > I have been tracking through a bug with Coda where, basically, getdents(2)
 > is not returning all the directory entries.  The files exist in the
 > directory but do not show up on a globbed listing.  After some digging I
 > found that things ended up in ufs_readdir() which was terminating early
 > due to a bad dirent.
 > What is happening is the userland part of Coda, venus, manufactures
 > the dirents for a directory when a read request comes up from the
 > kernel.  It has code to carefully avoid a dirent spanning a DIRBLKSIZ
 > boundary by padding a dirent near the boundary.  This data is returned
 > to the kernel for processing.  Where things come unstuck is that DIRBLKSIZ
 > is defined in /usr/include/dirent.h as 1024 bytes, while inside the kernel
 > ufs code (sys/ufs/ufs/dir.h) DIRBLKSIZ is set to DEV_BSIZE, which is 512
 > bytes.  This means that venus can produce a block of dirents that finish
 > before the 512 byte mark, the kernel code tries to align back to a 512
 > byte boundary and fails to find a valid dirent - the fact that ufs_readdir()
 > exits gracefully rather than causing a panic is more by luck than
 > anything else.
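
To make the mismatch concrete, here is a rough sketch of the kind of
packing described above. It is not venus or ufs code: pack_records()
and count_straddlers() are hypothetical helpers, and the 24-byte record
length is just an illustrative value. Records are laid out back to
back, and the previous record is grown to swallow the gap whenever the
next one would cross a multiple of the block size.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical packer (not the venus implementation): place n records
 * of the given lengths back to back, growing the previous record so
 * that no record crosses a multiple of blksiz.  Fills start[] and
 * len[] and returns the total number of bytes used.
 */
size_t
pack_records(const size_t *reclen, size_t n, size_t blksiz,
    size_t *start, size_t *len)
{
	size_t off = 0;

	for (size_t i = 0; i < n; i++) {
		assert(reclen[i] <= blksiz);
		size_t room = blksiz - (off % blksiz);
		if (reclen[i] > room) {
			/* Pad: the previous record absorbs the gap. */
			len[i - 1] += room;
			off += room;
		}
		start[i] = off;
		len[i] = reclen[i];
		off += reclen[i];
	}
	return off;
}

/* Count the records that cross a multiple of blksiz. */
size_t
count_straddlers(const size_t *start, const size_t *len, size_t n,
    size_t blksiz)
{
	size_t bad = 0;

	for (size_t i = 0; i < n; i++) {
		size_t first = start[i] / blksiz;
		size_t last = (start[i] + len[i] - 1) / blksiz;
		if (first != last)
			bad++;
	}
	return bad;
}
```

Packing forty 24-byte records with blksiz 1024 needs no padding at
all, so the record starting at offset 504 runs across offset 512. A
reader that re-aligns to 512-byte blocks then starts parsing in the
middle of that record. Packing the same records with blksiz 512 pads
the record before it out to offset 512 instead.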

Ugh, what a mess.

 > To fix it I think I have the following choices:
 > 1) Patch venus so it ignores the userland DIRBLKSIZ and, instead, uses
 > DEV_BSIZE (if available) or just hard code in 512 bytes as the dirent
 > block size.
 > 2) Change DIRBLKSIZ in dirent.h so it matches the kernel
 > 3) Fix mount_coda so it updates the um_dirblksiz to match userland.

I don't think any of those is the right answer. Coda is not limited to
running on top of ffs, so it should be doing only filesystem-independent
things when talking to the filesystem it uses for storage.

Therefore it should be using the value from <dirent.h> in both the
kernel and in venus. If it's running on top of ffs, ffs will provide
dirents with padding at 512-byte intervals that it would think
unnecessary, but I would think it shouldn't notice or care.

Then again, maybe I don't understand what's going on, as there
shouldn't be any way for ufs_readdir to see, much less trip on,
dirents generated by venus.

Note that ffs needs DIRBLKSIZ to be the same as the underlying atomic
I/O size, or various unspecified bad things can happen in crashes. So
you can't change what ffs is doing.

Also, I have no idea why the userland value diverged from the ffs
value, but I doubt it's safe to change it without adding a large pile
of compat wibble.

Finally, we should not not not have duplicate symbol names like this.
I guess now that we've branched I can go clean it up...

David A. Holland
