Subject: kern/34110: NFS client locks system if UDP is blocked
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <jmmv@netbsd.org>
List: netbsd-bugs
Date: 07/29/2006 09:05:01
>Number:         34110
>Category:       kern
>Synopsis:       NFS client locks system if UDP is blocked
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jul 29 09:05:00 +0000 2006
>Originator:     Julio M. Merino Vidal
>Release:        NetBSD 3.99.23
>Organization:
	
>Environment:
	
	
System: NetBSD dawn.home.network 3.99.23 NetBSD 3.99.23 (GENERIC) #22: Fri Jul 28 14:56:33 CEST 2006 root@max.home.network:/var/obj/usr/src-current/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
	I have a 3.0_STABLE machine serving multiple directories over NFS.
	This machine is using pf(4) to filter incoming connections and it
	blocks NFS UDP; i.e. it only allows NFS TCP.  In order to achieve
	this, it has the following rules:

	pass in on $iface inet proto udp to port rpcbind keep state
	pass in on $iface inet proto tcp to port rpcbind keep state
	pass in on $iface inet proto udp to port 1010:1024 keep state
	pass in on $iface inet proto tcp to port 1010:1024 keep state
	pass in on $iface inet proto tcp to port nfs keep state

	(The 1010:1024 port range is a big hack to let RPC in, but does the
	trick just fine.)

	Now, I did the following on a NetBSD 3.99.23 machine (as shown above):

	# cd /media
	# mount max.home.network:/home/jmmv jmmv

	This command stalls because the UDP mount cannot succeed.  I can
	interrupt it with CTRL+C, and the process does disappear from the
	process table, or at least so it seems from top(1) and ps(1)
	output.  Furthermore, mount(1) does not show any changes to the
	file system table.

	Unfortunately, at this point the system is already in an inconsistent
	state.  If I run 'ls' on /media, the command stalls and cannot be
	killed.  Running further commands on /media or /media/jmmv can make
	the situation worse and lock up the whole system (this only happened
	once, though).  By "lock up" I mean that the VFS does not respond to
	any queries, so I can neither execute new commands nor start any new
	login session.

	The only way out of this situation is to reboot the machine.  But it
	cannot be cleanly rebooted because the kernel hangs during the process
	at the 'unmounting file systems...' message.

	Mounting those NFS shares using TCP works perfectly fine.
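
	For reference, a sketch of forcing the TCP transport explicitly.  The
	-T flag is taken from mount_nfs(8) and the absolute mount point is
	only illustrative; neither appears in the commands above, so check
	both against the local manual page and setup:

	# mount_nfs -T max.home.network:/home/jmmv /media/jmmv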

>How-To-Repeat:
	See above.
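
	Condensed from the description, the sequence on the client is
	roughly as follows, with the server blocking NFS over UDP; the
	annotations in parentheses are not part of the commands:

	# cd /media
	# mount max.home.network:/home/jmmv jmmv
	(stalls; interrupt with CTRL+C)
	# ls /media
	(stalls and cannot be killed)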

>Fix:
	Unknown.  Could it be that a lock is not properly released?

>Unformatted: