Subject: kern/5960: NFS mounts hang on multi-homed clients
To: None <gnats-bugs@gnats.netbsd.org>
From: Jason R Thorpe <thorpej@nas.nasa.gov>
List: netbsd-bugs
Date: 08/12/1998 16:17:11
>Number:         5960
>Category:       kern
>Synopsis:       NFS mounts hang on multi-homed clients
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Aug 12 17:35:00 1998
>Last-Modified:
>Originator:     
>Organization:
Numerical Aerospace Simulation Facility - NASA Ames
>Release:        NetBSD 1.3G, Wed Aug 12 16:10:20 PDT 1998
>Environment:
	
System: NetBSD dracul 1.3G NetBSD 1.3G (DRACUL) #696: Wed Aug 12 12:46:50 PDT 1998 thorpej@dracul:/u5/netbsd/src/sys/arch/i386/compile/DRACUL i386


>Description:
	If I connect an AlphaStation 500 and a PPro 200, both running
	the same vintage of NetBSD, the PPro reliably hangs on an NFS
	mount.  The AlphaStation also hangs, but less reliably.

	The traces of the wedged processes all point to NFS:

	For a stuck ksh, thorpej's uid, attempting to run "ttcp":

		_sbwait
		_soreceive
		_nfs_receive
		_nfs_reply
		_nfs_request
		_nfs_getattr
		_nfs_lookup
		_lookup
		_namei
		_sys___stat13

	For a stuck csh, when I logged in as root on the console to
	reboot the system:

		_nfs_rcvlock
		_nfs_reply
		_nfs_request
		_nfs_lookup
		_lookup
		_namei
		_vn_open
		_sys_open

	I don't know, nor do I want to know, the NFS code.  I will,
	however, be happy to dig for whatever other information
	Frank asks for :-)

>How-To-Repeat:
	Multi-home an NFS client.  In the PPro case, de0 is the main
	connection, de1 is unused, and fxp0 is configured as 10.0.0.1
	connected back-to-back to an AlphaStation with an fxp0 configured
	as 10.0.0.2.  The problem seems to happen only after packets
	have traversed the network-10 interfaces.
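
	A minimal sketch of the back-to-back setup, for anyone trying to
	reproduce this.  The netmask is an assumption (the default class A
	mask for network 10), and ping is only one way to push packets
	across the link; any traffic over fxp0 should do.

	```sh
	# On the PPro (de0 stays configured as the main connection):
	ifconfig fxp0 inet 10.0.0.1 netmask 255.0.0.0 up   # netmask assumed

	# On the AlphaStation:
	ifconfig fxp0 inet 10.0.0.2 netmask 255.0.0.0 up   # netmask assumed

	# Back on the PPro, send some packets across the network-10 link:
	ping -c 5 10.0.0.2
	```

	After that, touching anything on an NFS mount should show the hang.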

>Fix:
	None supplied.
>Audit-Trail:
>Unformatted: