Subject: kern/3579: "root on nfs type ?" config panics finding root
To: None <gnats-bugs@gnats.netbsd.org>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: netbsd-bugs
Date: 05/05/1997 14:54:38
>Number:         3579
>Category:       kern
>Synopsis:       "root on nfs type ?" config panics finding root
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon May  5 15:05:00 1997
>Last-Modified:
>Originator:     Jonathan Stone
>Organization:
	
>Release:        NetBSD-current 1.2D as at 1997-05-02
>Environment:
	
System: NetBSD Whisk.DSG.Stanford.EDU 1.2D NetBSD 1.2D (DSG_4K) #13: Fri May 2 19:47:26 PDT 1997 jonathan@Cup.DSG.Stanford.EDU:/aga/n1/src/NetBSD/IP-PLUS/src/sys/arch/i386/compile/DSG_4K i386


>Description:

A `diskless' kernel booted from a floppy on an i386 panics early
during boot, apparently while accessing the NFS root filesystem.

The panic seems to be non-deterministic and may be triggered by network
traffic (e.g., ntp chimes) aimed at the MAC address of the
`diskless'-booting host.


>How-To-Repeat:

Build a kernel with a config line
		config		nfsnetbsd 	root ? type nfs

Put the resulting kernel on a floppy (e.g., as netbsd.gz when
using the 2.0-beta bootblocks).

Boot on a machine with a 3c595 or a de-500 Ethernet card.
Observe that the kernel panics shortly after printing the messages
identifying the interfaces where it found root and swap.

>Fix:

The following works around the problem for me.

I haven't followed through the code to check that the patch is
really correct (rather than just masking a symptom).

The printf() message should be taken out if the patch is committed.
(If it helps, I only see one such message per boot.)


*** nfs_socket.c.DIST	Wed Apr  9 04:23:02 1997
--- nfs_socket.c	Fri May  2 18:34:50 1997
***************
*** 663,668 ****
--- 663,679 ----
  		if (nam)
  			m_freem(nam);
  	
+ 
+ 		/* XXX multihomed machines lose? */
+ 		if (mrep == 0) {
+ 			printf("nfs_reply: null mbuf from nfs_receive()\n");
+ #if 0
+ 			return (0);
+ #else
+ 			continue;
+ #endif
+ 		}
+ 
  		/*
  		 * Get the xid and check that it is an rpc reply
  		 */
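
For illustration only, the shape of the workaround can be sketched
outside the kernel as a small C fragment. Everything named fake_* is
hypothetical and merely stands in for the kernel's mbuf and
nfs_receive() machinery; this is not the nfs_socket.c code. It shows a
receive loop that, like the `continue' branch of the patch, skips a
null reply and keeps waiting instead of dereferencing it.

```c
#include <stddef.h>

/* Hypothetical stand-in for an mbuf chain; not the kernel's struct mbuf. */
struct fake_mbuf {
	unsigned int xid;	/* RPC transaction id carried by the reply */
};

/*
 * Hypothetical receive source that hands back queued replies in order.
 * A NULL entry models a receive that reports success but yields no mbuf.
 */
struct fake_queue {
	struct fake_mbuf **items;
	size_t len;
	size_t pos;
};

static int
fake_receive(struct fake_queue *q, struct fake_mbuf **mrepp)
{
	if (q->pos >= q->len)
		return (-1);	/* queue drained: report an error */
	*mrepp = q->items[q->pos++];
	return (0);
}

/*
 * Wait for the reply whose xid matches want_xid.
 * Returns 0 on a match, non-zero once the queue runs dry.
 * *skipped counts the null replies that were ignored.
 */
static int
fake_reply(struct fake_queue *q, unsigned int want_xid, int *skipped)
{
	struct fake_mbuf *mrep;
	int error;

	*skipped = 0;
	for (;;) {
		mrep = NULL;
		error = fake_receive(q, &mrep);
		if (error)
			return (error);
		if (mrep == NULL) {
			/* Same idea as the patch: ignore it and try again. */
			(*skipped)++;
			continue;
		}
		if (mrep->xid == want_xid)
			return (0);
		/* A reply for some other request: keep waiting. */
	}
}
```

With a queue holding a reply with xid 7, a null entry, and a reply with
xid 42, fake_reply(&q, 42, &skipped) returns 0 and leaves skipped at 1.
Whether skipping is the right behaviour in the real nfs_reply(), rather
than returning an error, is the open question noted above.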


>Audit-Trail:
>Unformatted: