Subject: compat/linux getdent64() NFS problem (Was: Re: Linux emulation and NFS)
To: Matthias Scheler <tron@zhadum.de>
From: Jaromir Dolecek <jdolecek@netbsd.org>
List: tech-kern
Date: 05/24/2002 11:53:59
Hi,
so here are my findings so far:

The looping on NFS directories (or, currently, the fact that
only a small part of the directory is displayed) is actually
caused by the interaction between how glibc uses the d_off value
and the NFS cookies provided by our NFS code. This seems to be
a problem only with linux_sys_getdents64(), since apparently glibc
doesn't use the d_off value when getdents64() isn't available.

The cookies returned by our nfs_readdir() are in network byte
order, so instead of 12, 24, 36 ... the values in the array
appear as 201326592, 402653184, etc. on i386.  These values
are passed directly via the d_off member of the structure returned
to Linux userland. Now, glibc checks the d_off value and stops
reading once an entry's d_off is smaller than the previous entry's.
Due to the byteswapping, this eventually happens, and the result is
that glibc stops processing the returned directory data prematurely.
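
To illustrate the effect, here is a minimal sketch in plain C. It
assumes 32-bit cookies that step by 12 as in the example above; the
helper names swap32() and first_backward_offset() are made up for this
illustration and are not taken from glibc or the NetBSD sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. This is what a
 * little-endian host sees when it reads a big-endian cookie
 * without converting it first. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) <<  8) |
           ((v & 0x00ff0000u) >>  8) |
           ((v & 0xff000000u) >> 24);
}

/* Hypothetical helper: walk the offsets step, 2*step, ..., n*step and
 * return the first real offset whose byte-swapped form is smaller
 * than the byte-swapped form of the previous one. Returns 0 if the
 * swapped values never go backwards within n entries. */
uint32_t first_backward_offset(uint32_t step, size_t n)
{
    uint32_t prev = 0;

    for (size_t i = 1; i <= n; i++) {
        uint32_t real = (uint32_t)i * step;
        uint32_t seen = swap32(real);

        if (i > 1 && seen < prev)
            return real;
        prev = seen;
    }
    return 0;
}
```

With a step of 12, first_backward_offset(12, 100) returns 264. The 21
offsets before it swap to increasing values, but 264 (0x108) swaps to
0x08010000, which is below the previous entry's 0xFC000000, so a
"d_off must not go backwards" check would stop there.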

Now, I'm not too sure what to do. The d_off _has to_ be a valid
directory offset; glibc uses it in seekdir() and telldir(),
so it can't be fabricated, e.g. as an index value or something.
It has to be a valid parameter for lseek(), too.
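
The userland side of that requirement can be sketched with the
standard telldir()/seekdir() calls. The function name
seek_roundtrip_ok() and the 256-byte name buffer are assumptions made
for this example only; how a given libc maps the returned position
onto d_off is not shown here.

```c
#define _XOPEN_SOURCE 700   /* for telldir() and seekdir() */
#include <dirent.h>
#include <string.h>

/* Returns 1 if seeking back to a position saved with telldir()
 * yields the same directory entry again, 0 if it does not, and -1 on
 * error or if the directory has fewer than two entries. */
int seek_roundtrip_ok(const char *path)
{
    DIR *dir = opendir(path);
    int result = -1;
    char name[256];   /* assumes entry names fit in 255 bytes */

    if (dir == NULL)
        return -1;

    if (readdir(dir) != NULL) {            /* consume the first entry */
        long pos = telldir(dir);           /* position of the second */
        struct dirent *de = readdir(dir);

        if (pos != -1 && de != NULL) {
            strncpy(name, de->d_name, sizeof(name) - 1);
            name[sizeof(name) - 1] = '\0';

            seekdir(dir, pos);             /* return to saved position */
            de = readdir(dir);
            result = (de != NULL &&
                strncmp(name, de->d_name, sizeof(name) - 1) == 0);
        }
    }
    closedir(dir);
    return result;
}
```

If the offsets handed to userland were fabricated rather than real
directory positions, a round trip like this would be expected to land
on the wrong entry.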

My opinion is that nfs_readdir() should byteswap the cookies as
appropriate. nfs_readdir() is called (via VOP_READDIR()) by client
code only; it's not used by the NFS server code as far as I can tell.
The other emulations using/setting d_off should have this problem
too. I've checked that if I convert the cookies from network to host
byte order, the problem with listing NFS directory contents is gone
(i.e., no loop and all entries are displayed).
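
The conversion itself would amount to something like the following
sketch. It is not the actual nfs_readdir() change: the helper name
cookies_to_host() and the use of a plain uint32_t array are
assumptions for illustration, and the real cookie type in the kernel
may differ.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert n cookies in place from network byte
 * order to host byte order. ntohl() is a no-op on big-endian hosts,
 * so this is safe to call on any architecture. */
void cookies_to_host(uint32_t *cookies, size_t n)
{
    for (size_t i = 0; i < n; i++)
        cookies[i] = ntohl(cookies[i]);
}
```

After such a conversion the values seen by the caller would be
12, 24, 36 ... again on i386, so they stay increasing.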

OTOH, I'm not sure if the cookie value in nfs_readdir() is guaranteed
(i.e. 'defined') to be a real directory offset, or if it just happens
to be the offset. I assume that the glibc code _does_ work even
for NFS directories under Linux, so I assume the Linux NFS code does
return a real offset value in d_off, in host endianness.

My opinion is that we should do the same, and that the right place
to change is nfs_readdir(), rather than hacking
linux_sys_getdents64().

Jaromir
-- 
Jaromir Dolecek <jdolecek@NetBSD.org> http://www.NetBSD.org/Ports/i386/ps2.html
-=- We should be mindful of the potential goal, but as the tantric    -=-
-=- Buddhist masters say, ``You may notice during meditation that you -=-
-=- sometimes levitate or glow.   Do not let this distract you.''     -=-