Subject: Re: sparc GENERIC and NFS_BOOT_BOOTPARAM versus NFS_BOOT_DHCP
To: Luke Mewburn <lukem@netbsd.org>
From: Greg A. Woods <woods@weird.com>
List: port-sparc
Date: 06/08/2003 23:53:41
[ On Monday, June 9, 2003 at 11:56:44 (+1000), Luke Mewburn wrote: ]
> Subject: Re: sparc GENERIC and NFS_BOOT_BOOTPARAM versus NFS_BOOT_DHCP
>
> On Sun, Jun 08, 2003 at 09:33:24PM -0400, Greg A. Woods wrote:
>   | Not everyone can hack their own local DHCP server as easily as they
>   | might be able to run a rarpd and rpc.bootparamd on some machine they can
>   | control.
> 
> That is a bogus argument; if you can edit /etc/bootparams you can edit
> /etc/dhcpd.conf

No, it's absolutely not bogus at all.  Dhcpd and rarpd/rpc.bootparamd
can, and often do, run on completely different machines.  In a modern
LAN it's quite common for there to be a shared DHCP server -- one that
the person wanting to run NetBSD may not have immediate root access to.

> and the documentation and diagnostic information
> available to install and debug dhcpd is FAR better than that which is
> available for NetBSD's bootparamd.

Indeed, the documentation for netbooting with DHCP may now be somewhat
better, perhaps; but as for debugging and diagnostics, BOOTPARAM is a
simpler protocol and so is much easier to debug.  Like I said, I find
the diagnostic messages really are quite adequate, though perhaps that
depends on what you're most familiar with.

>   | This "solution" (i.e. the current state of affairs) also requires that
>   | _everyone_ who wants to netboot a sparc with a stock GENERIC kernel must
>   | run all of rarpd, rpc.bootparamd, and dhcpd, and they must make sure to
>   | configure them all compatibly.
> 
> Wrong;  rpc.bootparamd is not required by the current system.

Hmmm... yes, OK, one of the three is not needed.  There's still rarpd
and dhcpd, and as much as I quibble about our rarpd implementation I've
still got to run it nonetheless.  It also has to be run on the TFTP
server, which is often not the same machine as the DHCP server, and that
makes it much more error-prone and time-consuming to keep the configs in
sync.

>    We've made a conscious decision to
> attempt to make the various NetBSD platforms be more consistent in the
> way that they operate, including netbooting, where feasible.

Well, then perhaps you'll be the one to clean up the misleading text in
diskless(8)?

Meanwhile nobody has answered the most critical question here:  Can
nfs_bootdhcp() be fixed so that it will return a proper error and allow
nfs_bootparam() to take over?  If not, then the order must be changed,
because the current situation has broken the ability for anyone to
install via netboot, or to boot a previously working system after it
has been upgraded via the NFS server!  I.e. NetBSD/sparc is broken _now_
for everyone who is currently successfully using only BOOTPARAM!

Changing the order will not harm DHCP users, and it will return
functionality to BOOTPARAM users.  As far as I can tell there are no
downsides to this.  RARP has to work anyway (for the prom) so the extra
RARP cycle is only a tiny delay (assuming rarpd keeps working after the
first time) and the RPC/bootparam call should fail immediately if
there's no server running.
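
To make the ordering argument concrete, here is a rough userland model
of "try each method in turn, first success wins".  It is only a sketch:
struct boot_method, try_boot_methods(), struct lan and the probe_*()
stubs are names invented for illustration and are not the kernel's real
interfaces, and the error codes the stubs return are assumptions about
how each failure would look.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userland model, NOT the kernel's real interface. */
struct boot_method {
	const char *name;
	int (*probe)(void *ctx);	/* 0 on success, else an errno value */
};

/*
 * Try each method in table order.  The first one to return 0 wins and
 * its name is stored in *used.  If every method fails, the error from
 * the last one tried is returned and *used stays NULL.
 */
static int
try_boot_methods(const struct boot_method *tbl, size_t n, void *ctx,
    const char **used)
{
	int error = ENXIO;		/* returned for an empty table */
	size_t i;

	assert(tbl != NULL && used != NULL);
	*used = NULL;
	for (i = 0; i < n; i++) {
		error = tbl[i].probe(ctx);
		if (error == 0) {
			*used = tbl[i].name;
			return 0;
		}
	}
	return error;
}

/* Stand-in for which servers happen to exist on the LAN. */
struct lan {
	int have_bootparamd;
	int have_dhcpd;
};

static int
probe_bootparam(void *ctx)
{
	const struct lan *lan = ctx;

	/* Assumption: an RPC call to an absent server fails quickly. */
	return lan->have_bootparamd ? 0 : ECONNREFUSED;
}

static int
probe_dhcp(void *ctx)
{
	const struct lan *lan = ctx;

	/* Assumption: with no DHCP server the client can only time out. */
	return lan->have_dhcpd ? 0 : ETIMEDOUT;
}

/* The ordering proposed above: BOOTPARAM first, DHCP second. */
static const struct boot_method bootparam_first[] = {
	{ "bootparam", probe_bootparam },
	{ "dhcp", probe_dhcp },
};
```

With this table a LAN that has only rpc.bootparamd boots via the first
entry, and a LAN that has only dhcpd falls through to the second entry,
which is the sense in which the change costs DHCP users nothing but the
failed first attempt.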

Fixing the kernel so that it goes on to try BOOTPARAM when DHCP fails or
when the lease is incomplete is another approach, but is it actually
possible?  Note this solution forces the DHCP timeout delay on
BOOTPARAM-only users, at least until they build a new kernel, but
perhaps this is not too many users.
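
For what it's worth, the "fail properly so the next method can run"
idea might look something like the following.  Again this is only a
sketch under stated assumptions: struct lease_info, lease_complete()
and dhcp_result() are hypothetical names, I am assuming "incomplete"
means the reply lacks a root server or a root path, and ENOENT is an
arbitrary choice of error code.  I have not checked it against what
nfs_bootdhcp() actually does.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical summary of what a DHCP reply supplied. */
struct lease_info {
	int have_addr;			/* a client IP address was assigned */
	const char *root_server;	/* NFS server, or NULL if absent */
	const char *root_path;		/* exported root, or NULL if absent */
};

/* A lease is usable for a diskless boot only if nothing is missing. */
static int
lease_complete(const struct lease_info *li)
{
	return li->have_addr &&
	    li->root_server != NULL && li->root_server[0] != '\0' &&
	    li->root_path != NULL && li->root_path[0] != '\0';
}

/*
 * Turn the outcome of the DHCP exchange into an error code, so that a
 * caller can go on to the next method instead of stopping dead.
 */
static int
dhcp_result(int xchg_error, const struct lease_info *li)
{
	if (xchg_error != 0)
		return xchg_error;	/* e.g. ETIMEDOUT: nobody answered */
	if (!lease_complete(li))
		return ENOENT;		/* answered, but cannot boot from it */
	return 0;
}
```

A caller would treat any non-zero return as "DHCP did not give us a
root filesystem" and move on to BOOTPARAM, which is exactly the case
that costs BOOTPARAM-only users the DHCP timeout first.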

-- 
								Greg A. Woods

+1 416 218-0098;            <g.a.woods@ieee.org>;           <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>