Subject: Re: problem with tmpfs and linux emulation?
To: Michael van Elst <mlelstv@serpens.de>
From: Steven M. Bellovin <smb@cs.columbia.edu>
List: current-users
Date: 11/13/2005 19:49:38
In message <dl8j56$amu$2@serpens.de>, Michael van Elst writes:
>pcah8322@artax.karlin.mff.cuni.cz (Pavel Cahyna) writes:
>
>>On Sun, Nov 13, 2005 at 09:45:14PM +0100, Michael van Elst wrote:
>>> When d_off is not a valid offset, the llseek fails and the next
>>> iteration may or may not read more directory entries.
>
>>How can d_off be invalid? It was returned by the kernel before, so it
>>should be valid, no?
>
>With tmpfs (and NFS) d_off is an opaque cookie and does
>not resemble an offset into the "directory file".
>

So we now understand the problem: some of our file systems violate 
assumptions that the Linux library makes.  The question is what to do.

The problem, as documented in Michael van Elst's post (see
http://mail-index.netbsd.org/current-users/2005/11/13/0011.html ),
is that the Linux getdents() routine not unreasonably relies on Linux 
kernel semantics.  (Their man page warns you that that isn't the proper 
interface, and points people at readdir(3); readdir(3) warns that using 
any non-POSIX structure entries isn't a good idea.  In other words, 
they're playing absolutely clean from a standards perspective.)

It seems likely that nothing will make this work short of a real byte 
offset in d_off.  This in turn suggests that either the Linux emulation 
layer has to have a way of knowing if it needs to do a conversion, or 
the file system layer needs to do the conversion, possibly as a result 
of a mount-time flag.  I'd really rather avoid the latter if possible 
-- how expensive would it be for tmpfs to maintain a real byte offset?  
(Seeking to that point need not be cheap; it's an infrequent operation, 
I suspect.)
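
To make the idea concrete, here is a rough sketch in C of deriving a
byte-style d_off from a directory whose native positions are opaque
cookies.  It is an illustration only, not tmpfs or emulation-layer
code: the struct, the function names, the sample cookies, and the
assumed 19-byte record header with 8-byte alignment are all
hypothetical.  The offset for each entry is just the running total of
the packed record sizes, and seeking is a linear walk, on the
assumption stated above that seeks are infrequent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory directory entry; not a real kernel structure. */
struct sketch_entry {
    uint64_t    cookie;  /* opaque position, as tmpfs or NFS might hand out */
    const char *name;
};

/* Assumed fixed header size of one packed record (illustrative value). */
#define SKETCH_HDR        19
#define SKETCH_BAD_OFFSET ((size_t)-1)

/* Sample directory with arbitrary, non-monotonic cookies. */
static const struct sketch_entry sketch_sample[] = {
    { 0x9e3779b97f4a7c15ULL, "." },
    { 0x0000000000000042ULL, ".." },
    { 0x7fffffff00000001ULL, "hello.txt" },
};

/* Packed size of one record: header + name + NUL, rounded up to 8 bytes. */
size_t sketch_reclen(const char *name)
{
    assert(name != NULL);
    return (SKETCH_HDR + strlen(name) + 1 + 7) & ~(size_t)7;
}

/*
 * Synthetic d_off for entry idx: the byte offset at which the NEXT
 * record would start, computed as a running total of record sizes.
 */
uint64_t sketch_offset_after(const struct sketch_entry *dir, size_t n,
                             size_t idx)
{
    uint64_t pos = 0;

    assert(idx < n);
    for (size_t i = 0; i <= idx; i++)
        pos += sketch_reclen(dir[i].name);
    return pos;
}

/*
 * Map a synthetic byte offset back to an entry index by walking from
 * the start.  Returns n for end-of-directory, or SKETCH_BAD_OFFSET if
 * the offset does not fall on a record boundary.  Deliberately O(n).
 */
size_t sketch_seek(const struct sketch_entry *dir, size_t n, uint64_t off)
{
    uint64_t pos = 0;

    for (size_t i = 0; i < n; i++) {
        if (pos == off)
            return i;
        if (pos > off)
            return SKETCH_BAD_OFFSET;
        pos += sketch_reclen(dir[i].name);
    }
    return pos == off ? n : SKETCH_BAD_OFFSET;
}
```

With the sample directory, the value reported after "." would be 24,
and seeking to 24 lands on "..", regardless of what the cookies are.
A real implementation would also have to cope with entries being
added or removed between calls, which this sketch ignores.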

		--Steven M. Bellovin, http://www.cs.columbia.edu/~smb