[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: DIRBLKSIZ differs between userland & kernel
On Mon, Feb 27, 2012 at 10:55:48AM +1030, Brett Lymn wrote:
> I have been tracking through a bug with Coda where, basically, getdents(2)
> is not returning all the directory entries. The files exist in the
> directory but do not show up on a globbed listing. After some digging I
> found that things ended up in ufs_readdir() which was terminating early
> due to a bad dirent.
> What is happening is the userland part of Coda, venus, manufactures a
> the dirents for a directory when a read request comes up from the
> kernel. It has code to carefully avoid a dirent spanning a DIRBLKSIZ
> boundary by padding a dirent near the boundary. This data is returned
> to the kernel for processing. Where things come unstuck is DIRBLKSIZ is
> defined in /usr/include/dirent.h as 1024 bytes, inside the kernel ufs
> code sys/ufs/ufs/dir.h DIRBLKSIZE is set to DEV_BSIZE which is 512
> bytes. This means that venus can produce a block of dirents that finish
> before the 512 byte mark, the kernel code tries to align back to a 512
> byte boundary and fails to find a valid dirent - the fact that ufs_readdir()
> exits gracefully rather than causing a panic is more by luck than
> anything else.
Ugh, what a mess.
> To fix it I think I have the following choices:
> 1) Patch venus so it ignores the userland DIRBLKSIZ and, instead, uses
> DEV_BSIZE (if available) or just hard code in 512 bytes as the dirent
> block size.
> 2) Change DIRBLKSIZ in dirent.h so it matches the kernel
> 3) Fix mount_coda so it updates the um_dirblksiz to match userland.
I don't think any of those is the right answer. Coda is not limited to
running on top of ffs, so it shouldn't be doing only filesystem-
independent things when talking to the filesystem it uses for storage.
Therefore it should be using the value from <dirent.h> in both the
kernel and in venus. If it's running on top of ffs, ffs will provide
dirents with padding at 512-byte intervals that it would think
unnecessary, but I would think it shouldn't notice or care.
Then again, maybe I don't understand what's going on, as there
shouldn't be any way for ufs_readdir to see, much less trip on,
dirents generated by venus.
Note that ffs needs DIRBLKSIZ to be the same as the underlying atomic
I/O size, or various unspecified bad things can happen in crashes. So
you can't change what ffs is doing.
Also, I have no idea why the userland value diverged from the ffs
value, but I doubt it's safe to change it without adding a large pile
of compat wibble.
Finally, we should not not not have duplicate symbol names like this.
I guess now that we've branched I can go clean it up...
David A. Holland
Main Index |
Thread Index |