Subject: Re: sparc GENERIC and NFS_BOOT_BOOTPARAM versus NFS_BOOT_DHCP
To: Luke Mewburn <lukem@netbsd.org>
From: Greg A. Woods <woods@weird.com>
List: port-sparc
Date: 06/08/2003 21:33:24
[ On Monday, June 9, 2003 at 10:13:41 (+1000), Luke Mewburn wrote: ]
> Subject: Re: sparc GENERIC and NFS_BOOT_BOOTPARAM versus NFS_BOOT_DHCP
>
> NetBSD/sparc *used* to try bootparams first, and then dhcp, and then
> I changed it to the other way around (after consultation with the port
> master Paul Kranenburg, as well as Matt Green), to be more consistent
> with how almost everything else diskless boots in NetBSD.  This is
> documented in the NetBSD/sparc 1.6 install notes.

I had a vague memory that it might have been different before....

> FWIW: I find that dhcp & tftp is far more reliable than bootparams,
> and far easier to debug; rpc.bootparamd in NetBSD has suboptimal
> diagnostics.

I find BOOTPARAM far easier to debug than DHCP, at least when the server
is running NetBSD or SunOS.  I've never had a problem with the state of
the diagnostic messages from NetBSD's rpc.bootparamd as it does its
syslogging at the same verbosity as it would ramble on with '-d'.  It
might help some if it didn't use LOG_NOTICE as the priority for _every_
message (LOG_INFO would be fine for some), but that's a separate issue.

As for reliability, well I've found the real problem is with RARP,
especially NetBSD's rarpd, and not with any implementation of either
BOOTPARAM or DHCP that I've used for this purpose.  DHCP doesn't really
help any here though because you still need RARP for each and every
NetBSD/sparc system (at least so far as I've been able to find out).

(For sparc64 you need OpenBoot-3.25 or newer on most Sun Ultra systems,
or 4.x generally, before the ":dhcp" and ":bootp" modifiers are
available for the "boot net" command.)

> The simple solution for you is to add an appropriate entry in your
> dhcp server configuration.

Not everyone can hack their own local DHCP server as easily as they
might be able to run a rarpd and rpc.bootparamd on some machine they can
control.  It's my experience that trying to run multiple DHCP servers on
the same network is far less reliable than running multiple BOOTPARAM
servers, though that may have been my fault.  :-)

This "solution" (i.e. the current state of affairs) also requires that
_everyone_ who wants to netboot a sparc with a stock GENERIC kernel must
run all of rarpd, rpc.bootparamd, and dhcpd, and they must make sure to
configure them all compatibly.  This is a big departure for anyone
accustomed only to ancient SunOS where rarpd and rpc.bootparamd are
sufficient.

>  Or you could maintain local patches to
> your sources.

I definitely will maintain local patches (for now I'll just drop
NFS_BOOT_DHCP from all my sparc kernels), but I brought this up on the
list because I think there's now a fundamental discrepancy between what
diskless(8) says about NetBSD/sparc (which I argue is the more correct
point of view) and the way the code actually works for NetBSD/sparc.

Ideally on NetBSD/sparc64 both boot.net and the kernel would look to see
if ":dhcp" or ":bootp" were used with the "boot net" command and behave
appropriately.  Is it possible to get the boot command string from the
prom?

The bigger question is whether it's possible to fix NFS_BOOT_DHCP so
that it will actually go on to try BOOTPARAM if it gets a
non-functional or incomplete DHCP lease.

I also think it may still make more sense to always try BOOTPARAM first
(in the GENERIC kernel at least, perhaps even on all ports, but certainly
on sparc and sparc64 where RARP/BOOTPARAM are the default) because with
BOOTPARAM you're far more likely to get either a proper and complete
response or no response at all, and in the latter case the code can
easily go on to try for a DHCP lease.  With DHCP, however, it's far more
likely that the kernel will get a random dynamic lease on the average
site's network than one with the correct root-path.  If I remember
correctly even the timeout delays are more favourable for BOOTPARAM,
though RARP delays probably balance this factor out.
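
To make the fall-through idea concrete, here is a minimal sketch in C.
It is not the kernel's actual nfs_boot code: the struct, the function
names, and the "complete means an address plus a non-empty root-path"
test are all assumptions made for illustration.  The point is only
that a reply lacking a root-path is treated the same as no reply, so
the next method in the list still gets its turn whichever one is
tried first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result of one attempt; not the kernel's real structure. */
struct boot_info {
	bool have_addr;		/* did the reply assign us an address? */
	const char *root_path;	/* NFS root path, NULL if the reply had none */
};

/* A method returns true if some server answered at all. */
typedef bool (*boot_method_fn)(struct boot_info *);

struct boot_method {
	const char *name;
	boot_method_fn try;
};

/* Assumed completeness test: an address and a non-empty root path. */
static bool
info_is_complete(const struct boot_info *bi)
{
	return bi->have_addr && bi->root_path != NULL &&
	    strlen(bi->root_path) > 0;
}

/*
 * Try each method in the given order and accept the first answer that
 * is complete.  Returns the name of the method used, or NULL if none
 * produced a usable answer.
 */
static const char *
pick_boot_method(const struct boot_method *m, size_t n, struct boot_info *out)
{
	assert(m != NULL && out != NULL);
	for (size_t i = 0; i < n; i++) {
		struct boot_info bi = { false, NULL };

		if (!m[i].try(&bi))
			continue;	/* no response: try the next method */
		if (!info_is_complete(&bi))
			continue;	/* e.g. a dynamic lease with no root-path */
		*out = bi;
		return m[i].name;
	}
	return NULL;
}

/* Stand-ins for real network code, used only to exercise the logic. */
static bool
fake_dhcp_dynamic_lease(struct boot_info *bi)
{
	bi->have_addr = true;	/* got a lease... */
	bi->root_path = NULL;	/* ...but the server sent no root-path */
	return true;
}

static bool
fake_bootparam(struct boot_info *bi)
{
	bi->have_addr = true;
	bi->root_path = "/export/root/client";	/* made-up path */
	return true;
}
```

With an order of { dhcp, bootparam } and the two stand-ins above,
pick_boot_method() skips the incomplete lease and returns "bootparam".
Reversing the array gives the BOOTPARAM-first behaviour without
touching the loop.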

-- 
								Greg A. Woods

+1 416 218-0098;            <g.a.woods@ieee.org>;           <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>