Subject: kern/13633: NFS Problems With NetBSD-1.5.1/Alpha
To: None <>
From: None <>
List: netbsd-bugs
Date: 08/05/2001 16:52:18
>Number:         13633
>Category:       kern
>Synopsis:       NFS Client-side problems with NetBSD-1.5.1/Alpha
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Aug 05 16:49:00 PDT 2001
>Originator:     Bill Dorsey
>Release:        NetBSD 1.5.1
>Environment:
Personal Workstation, NetBSD 1.5.1, Alpha
System: NetBSD spitfire 1.5.1 NetBSD 1.5.1 (SPITFIRE) #0: Tue Jul 24 00:59:25 PDT 2001 alpha
>Description:

When a NetBSD machine mounts an NFS filesystem exported by a
NetBSD-1.5.1/Alpha machine, problems occur when working with
directories that contain a large number of files.  This can be
illustrated by using the find(1) command to search through a
large hierarchy.  It will begin to search through the
hierarchy and then numerous failure messages will appear
saying, "./<some_name>: No such file or directory."  A similar
failure mode can be experienced when using the ls(1) command
with the "-alR" arguments.
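The symptom above (a name returned by a directory read that then cannot
be stat'ed) can also be checked mechanically.  The following is only a
minimal sketch, not part of the original report: it assumes Python is
available on the NFS client, and the function name
find_vanishing_entries is hypothetical.

```python
import os
import stat
import sys


def find_vanishing_entries(root):
    """Walk `root` and return (path, errno) pairs for every directory
    that cannot be read and every listed name that lstat() rejects."""
    failures = []
    pending = [root]
    while pending:
        directory = pending.pop()
        try:
            names = os.listdir(directory)
        except OSError as exc:
            # The directory itself could not be read; record and move on.
            failures.append((directory, exc.errno))
            continue
        for name in names:
            path = os.path.join(directory, name)
            try:
                info = os.lstat(path)
            except OSError as exc:
                # Listed by the directory read, but lstat() failed.
                failures.append((path, exc.errno))
                continue
            if stat.S_ISDIR(info.st_mode):
                pending.append(path)
    return failures


if __name__ == "__main__" and len(sys.argv) > 1:
    for path, err in find_vanishing_entries(sys.argv[1]):
        print(f"{path}: {os.strerror(err)}")
```

Run against the mount point of the NFS volume; on a healthy, idle
filesystem the list should be empty.  Files deleted by another process
during the walk would also show up, so a quiescent tree is assumed.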

Additionally, sometimes if one uses the ls(1) command with the
"-ali" arguments, one can see the "." and ".." directories have
the same inode number.  This is probably also related to the
pwd(1) command printing out current working directories with
a lot of "./././././././" embedded in the output.
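
The "." versus ".." comparison can be scripted rather than read off
"ls -ali" output.  This is a hedged sketch under the assumption that
Python is available on the client; dot_and_dotdot_collide is a
hypothetical helper name, and comparing the device number alongside the
inode number is my addition, not something the report itself does.

```python
import os


def dot_and_dotdot_collide(directory):
    """Return True if "." and ".." inside `directory` report the same
    (device, inode) pair."""
    here = os.stat(os.path.join(directory, "."))
    parent = os.stat(os.path.join(directory, ".."))
    return (here.st_dev, here.st_ino) == (parent.st_dev, parent.st_ino)
```

Note that the root directory "/" legitimately has identical "." and
".." entries, so a True result is only suspicious for a directory
below the top of the tree.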

I have tested the following configurations; the Problem column
shows which ones reproduce the failure:

NFS Client            NFS Server            Problem
NetBSD-1.5.1/Alpha    NetBSD-1.5.1/Alpha    Yes
NetBSD-1.5.1/Alpha    NetBSD-1.5/Alpha      Yes
NetBSD-1.5/Alpha      NetBSD-1.5.1/Alpha    No
NetBSD-1.4.2/i386     NetBSD-1.5.1/Alpha    Yes
Solaris 7/Sparc       NetBSD-1.5.1/Alpha    No
NetBSD-1.5.1/Alpha    Solaris 7/Sparc       No

I ran a ktrace(1) on a find(1) command from one of the configurations
listed above and noted that right after a chdir("..") system
call, a stat(2) call was failing on a file that was expected to
exist based on an earlier read of the parent directory.  This
would be explained if the NFS code returned the same inode number
for '..' as for '.', since chdir("..") would then leave find(1) in
the subdirectory rather than in its parent.
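
The sequence seen in the trace (read the parent, descend, chdir("..")
back, then stat a sibling by relative name) can be approximated
outside find(1).  The sketch below only loosely mimics that sequence
and is not taken from the trace itself; it assumes Python on the
client, and stat_after_chdir_up is a hypothetical name.

```python
import os
import stat


def stat_after_chdir_up(parent):
    """List `parent`, then for each subdirectory chdir into it and back
    out via "..", and check that every name from the original listing
    can still be lstat()ed by relative path.  Returns a list of
    (subdirectory, name, errno) failures."""
    failures = []
    saved_cwd = os.getcwd()
    try:
        os.chdir(parent)
        names = os.listdir(".")
        subdirs = []
        for name in names:
            try:
                if stat.S_ISDIR(os.lstat(name).st_mode):
                    subdirs.append(name)
            except OSError as exc:
                # Failed before any chdir took place.
                failures.append((None, name, exc.errno))
        for sub in subdirs:
            try:
                os.chdir(sub)
            except OSError as exc:
                failures.append((sub, sub, exc.errno))
                continue
            os.chdir("..")
            # If ".." wrongly resolved to ".", we are still inside `sub`
            # and the parent's names will fail to resolve here.
            for name in names:
                try:
                    os.lstat(name)
                except OSError as exc:
                    failures.append((sub, name, exc.errno))
    finally:
        os.chdir(saved_cwd)
    return failures
```

An empty result means every name survived the round trip; entries
tagged with a subdirectory name point at the chdir("..") step.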

The kern/11618 bug appears very similar to this (and may in fact
be the same bug), but I have not seen this with versions of the
kernel prior to 1.5.1.  Significant changes were made in the
kernel NFS code for release 1.5.1 which may be related to this
problem so I have created a new bug rather than adding to the
old one.
>How-To-Repeat:


On a NetBSD-1.5[.1]/Alpha machine, export a filesystem that contains
directories with a large number of files/subdirectories (I originally
experienced the problem with a gcc-2.95.3 build tree).  Now mount
this filesystem on an NFS client in one of the configurations
listed in the table above as exhibiting the problem.  Now change
directories to the mounted volume and traverse through the filesystem
using the "ls -alR" command or a "find . -name 'foo' -print" command.
The commands should begin to fail after displaying a handful of files.
>Fix:


Don't use an Alpha-powered machine running NetBSD-1.5[.1] for your
NFS server.