Subject: sparc GENERIC and NFS_BOOT_BOOTPARAM versus NFS_BOOT_DHCP
To: NetBSD/sparc Discussion List <port-sparc@NetBSD.ORG>
From: Greg A. Woods <woods@weird.com>
List: port-sparc
Date: 06/08/2003 14:46:57
I've just been trying to netboot an SS20, via rarpd and rpc.bootparamd,
with a GENERIC kernel built recently from the netbsd-1-6 branch.  This
kernel of course has the following options:

	## NFS boot options; tries DHCP/BOOTP then BOOTPARAM
	options         NFS_BOOT_BOOTPARAM
	#options        NFS_BOOT_BOOTP
	options         NFS_BOOT_DHCP

On my network it is very unfortunate that the kernel tries to get its
root filesystem parameters from DHCP before it tries BOOTPARAM, since
that will never work.  If it tries DHCP first then it gets assigned an
address from my dynamic range by default, and then it fails because
there's obviously no default "option root-path" given in the leases for
my dynamic address range.

I know I can get this to work by commenting out the NFS_BOOT_DHCP line
because that's what I've done for similar systems.  (I had just
forgotten to comment it out in this kernel.)

I'm assuming from the following comment in diskless(8) that doing so
should not be necessary as it's not the most desired way of doing
things:

                In general, the GENERIC config(8) files
          for any particular architecture will specify options to activate in
          the kernel the same protocol used by the boot program for that ar-
          chitecture, 

The code in sys/nfs/nfs_boot.c seems to suggest that it should go on to
try BOOTPARAM if the DHCP attempt fails:

        error = EADDRNOTAVAIL; /* ??? */
#if defined(NFS_BOOT_BOOTP) || defined(NFS_BOOT_DHCP)
        if (error && nfs_boot_rfc951) {
#if defined(NFS_BOOT_DHCP)
                printf("nfs_boot: trying DHCP/BOOTP\n");
#else
                printf("nfs_boot: trying BOOTP\n");
#endif
                error = nfs_bootdhcp(nd, procp);
        }
#endif
#ifdef NFS_BOOT_BOOTPARAM
        if (error && nfs_boot_bootparam) {
                printf("nfs_boot: trying RARP (and RPC/bootparam)\n");
                error = nfs_bootparam(nd, procp);
        }
#endif
        if (error)
                return (error);

I've not yet looked at nfs_bootdhcp() in detail, but I'm assuming it
doesn't "fail properly" because it is getting a DHCP reply, just not a
complete and usable one.
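
To make that assumption concrete, here is a small userland model of the
fallback chain quoted above.  It is only a sketch: struct boot_info,
try_dhcp(), try_bootparam() and boot_init() are hypothetical stand-ins,
not the real kernel functions, and the stub DHCP routine is written to
return success without a root path purely because that is what I'm
guessing nfs_bootdhcp() does here.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for the boot parameters the kernel collects. */
struct boot_info {
	char my_addr[16];
	char root_path[64];	/* empty string: no root path supplied */
};

/*
 * Stub modelling a DHCP server that hands out a dynamic lease with no
 * "option root-path".  ASSUMPTION: it reports success anyway.
 */
int
try_dhcp(struct boot_info *bi)
{
	strcpy(bi->my_addr, "204.92.254.140");
	bi->root_path[0] = '\0';
	return (0);
}

/* Stub modelling a working RARP + RPC/bootparam exchange. */
int
try_bootparam(struct boot_info *bi)
{
	strcpy(bi->my_addr, "204.92.254.10");
	strcpy(bi->root_path, "/almost");
	return (0);
}

/* Same shape as the quoted fragment: DHCP first, BOOTPARAM on error. */
int
boot_init(struct boot_info *bi, int *bootparam_tried)
{
	int error = EADDRNOTAVAIL;

	*bootparam_tried = 0;
	if (error)
		error = try_dhcp(bi);
	if (error) {
		/* Only reached if the DHCP attempt returned non-zero. */
		*bootparam_tried = 1;
		error = try_bootparam(bi);
	}
	return (error);
}
```

With these stubs boot_init() returns 0 with an empty root_path and
without ever calling try_bootparam(), which matches the console output
further down: the DHCP values are printed and then the mount fails for
want of a pathname.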

However, would it not make far more sense to try BOOTPARAM first?  I
suspect the PROM and the boot program are always going to use BOOTPARAM,
so GENERIC should as well, as per diskless(8).
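
Here is a rough userland sketch of what "BOOTPARAM first" would mean,
with the order expressed as a table so it is easy to see and to change.
None of these names exist in nfs_boot.c; boot_method, try_in_order() and
the three stub methods are made up for illustration only.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical: one boot protocol attempt, returning 0 or an errno. */
typedef int (*boot_method)(void);

/* Hypothetical stubs standing in for the real protocol code. */
int method_bootparam(void) { return (0); }
int method_dhcp(void) { return (0); }
int method_unreachable(void) { return (ETIMEDOUT); }

/*
 * Try each method in table order and stop at the first success.
 * On success *used is set to the index of the method that worked.
 */
int
try_in_order(const boot_method *methods, size_t n, size_t *used)
{
	int error = EADDRNOTAVAIL;
	size_t i;

	for (i = 0; i < n && error; i++) {
		error = methods[i]();
		if (error == 0)
			*used = i;
	}
	return (error);
}
```

Calling try_in_order() with a table of { method_bootparam, method_dhcp }
stops at index 0, so DHCP is consulted only if BOOTPARAM returns an
error.  That is the same protocol the boot program already used
successfully in the session below.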

Actually I'm not even sure I see any point to having NFS_BOOT_DHCP in a
GENERIC sparc kernel in the first place (other than for the "Gee Whiz!"
factor).  You need a properly configured and running rpc.bootparamd
anyway so normally there's nothing to be gained and only the potential
for consistency errors if you have to spread your configs about and
duplicate bits of information unnecessarily.


FYI, here's the whole console session and the server's related logs:

Rebooting with command:                                               
Boot device: /iommu/sbus/ledma@f,400010/le@f,c00000  File and args: 
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
Automatic network cable selection succeeded : Using TP Ethernet Interface 
13800 
Server IP address: 204.92.254.18 
Client IP address: 204.92.254.10 
>> NetBSD/sparc Secondary Boot, Revision 1.9
>> (woods@sometimes, Tue Oct 23 04:35:04 EDT 2001)
Booting netbsd
Automatic network cable selection succeeded : Using TP Ethernet Interface 
Using BOOTPARAMS protocol: ip address: 204.92.254.10, hostname: almost.weird.com
root addr=204.92.254.18 path=/almost
2846168+100084+258944 [68+182752+139533]=0x36d544
OBP version 3, revision 2.19 (plugin rev 2)
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 1.6.1_STABLE (GENERIC) #0: Mon May 26 23:11:30 EDT 2003
    woods@proven:/var/obj/NetBSD/arch/sparc/compile/GENERIC
total memory = 175 MB
nbuf at 2269 is too large for VM_MAX_KERNEL_BUF... adjusted to 896
avail memory = 158 MB
using 896 buffers containing 9076 KB of memory
bootpath: /iommu@f,e0000000/sbus@f,e0001000/ledma@f,400010/le@f,c00000
mainbus0 (root): SUNW,SPARCstation-20
cpu0 at mainbus0: TMS390Z50 v0 or TMS390Z55 @ 60 MHz, on-chip FPU
cpu0: physical 20K instruction (64 b/l), 16K data (32 b/l), 1024K external (32 b/l): cache enabled
obio0 at mainbus0
clock0 at obio0 slot 0 offset 0x200000: mk48t08: hostid 72716a18
timer0 at obio0 slot 0 offset 0x300000: delay constant 28
zs0 at obio0 slot 0 offset 0x100000 level 12 softpri 6
zstty0 at zs0 channel 0 (console i/o)
zstty1 at zs0 channel 1
zs1 at obio0 slot 0 offset 0x0 level 12 softpri 6
kbd0 at zs1 channel 0: baud rate 1200
ms0 at zs1 channel 1: baud rate 1200
fdc0 at obio0 slot 0 offset 0x700000 level 11: no drives attached
auxreg0 at obio0 slot 0 offset 0x800000
power0 at obio0 slot 0 offset 0xa01000 level 2
iommu0 at mainbus0 ioaddr 0xe0000000: version 0x3/0x1, page-size 4096, range 64MB
sbus0 at iommu0: clock = 25 MHz
dma0 at sbus0 slot 15 offset 0x400000: dma rev 2
esp0 at dma0 slot 15 offset 0x800000 level 4: ESP200, 40MHz, SCSI ID 7
scsibus0 at esp0: 8 targets, 8 luns per target
ledma0 at sbus0 slot 15 offset 0x400010: dma rev 2
le0 at ledma0 slot 15 offset 0xc00000 level 6: address 08:00:20:71:6a:18
le0: 8 receive buffers, 2 transmit buffers
bpp0 at sbus0 slot 15 offset 0x4800000 level 2 (ipl 3): dma rev 2
SUNW,DBRIe at sbus0 slot 14 offset 0x10000 level 9 not configured
eccmemctl0 at mainbus0 ioaddr 0x0: version 0x0/0x2
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 1 lun 0: <SEAGATE, ST32550W SUN2.1G, 0418> SCSI2 0/direct fixed
sd0: 2048 MB, 3511 cyl, 11 head, 108 sec, 512 bytes/sect x 4194995 sectors
sd0: sync (100.0ns offset 15), 8-bit (10.000MB/s) transfers, tagged queueing
sd1 at scsibus0 target 3 lun 0: <SEAGATE, ST32550W SUN2.1G, 0418> SCSI2 0/direct fixed
sd1: 2048 MB, 3511 cyl, 11 head, 108 sec, 512 bytes/sect x 4194995 sectors
sd1: sync (100.0ns offset 15), 8-bit (10.000MB/s) transfers, tagged queueing
Kernelized RAIDframe activated
sd0: no disk label
sd1: no disk label
root on le0
nfs_boot: trying DHCP/BOOTP
nfs_boot: DHCP next-server: 204.92.254.2
nfs_boot: my_domain=weird.com
nfs_boot: my_addr=204.92.254.140
nfs_boot: my_mask=255.255.255.0
nfs_boot: gateway=204.92.254.6
nfs_boot: getfh - no pathname
no file system for le0
cannot mount root, error = 79
root device (default le0): 


Clearly no second BOOTPARAM query is being logged by rpc.bootparamd:

Jun  8 14:01:15 sometimes rarpd[10833]: received packet on le0
Jun  8 14:01:15 sometimes rarpd[10833]: 08:00:20:71:6a:18 asked; almost.weird.com replied
Jun  8 14:01:15 sometimes rarpd[10833]: 08:00:20:71:6a:18 asked; almost.weird.com replied
Jun  8 14:01:15 sometimes inetd[16542]: incoming connection from [204.92.254.10:57571], for service [*:tftp]/udp at local address [0.0.0.0:69]
Jun  8 18:01:15 sometimes tftpd[16542]: running as user `nobody' (2147483647), group `(unspecified)' (2147483647)
Jun  8 18:01:15 sometimes tftpd[16543]: almost.weird.com: read request for CC5CFE0A.SUN4M: success
Jun  8 14:01:17 sometimes rarpd[10833]: received packet on le0
Jun  8 14:01:17 sometimes rarpd[10833]: 08:00:20:71:6a:18 asked; almost.weird.com replied
Jun  8 14:01:17 sometimes rarpd[10833]: 08:00:20:71:6a:18 asked; almost.weird.com replied
Jun  8 14:01:18 sometimes rpc.bootparamd: whoami got question for 204.92.254.10
Jun  8 14:01:18 sometimes rpc.bootparamd: This is host almost.weird.com
Jun  8 14:01:18 sometimes rpc.bootparamd: Returning almost.weird.com   .weird.com    204.92.254.18
Jun  8 14:01:18 sometimes rpc.bootparamd: getfile got question for "almost.weird.com" and file "root"
Jun  8 14:01:18 sometimes rpc.bootparamd: returning server:sometimes.weird.com path:/almost address: 204.92.254.18
Jun  8 14:01:19 sometimes rpcbind: connect from 204.92.254.10 to getport/addr(mountd)
Jun  8 14:01:24 sometimes rpcbind: connect from 204.92.254.10 to getport/addr(nfs)

-- 
								Greg A. Woods

+1 416 218-0098;            <g.a.woods@ieee.org>;           <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>