port-i386: Re: serial console HOWTO?

Subject: Re: serial console HOWTO?
To: Jonathan Stone <jonathan@DSG.Stanford.EDU>
From: Miles Nordin <carton@Ivy.NET>
List: port-i386
Date: 01/21/2000 20:43:14
Okay, first of all, my name's Miles, not Mike.  I think Mike was also
involved in this thread with you, and he probably doesn't want my ranting
attributed to him.  I don't really mind, but I'd better point it out.

On Fri, 21 Jan 2000, Jonathan Stone wrote:

> you must have a very different model of console usage, or system
> usage and management,

The one in my head is perhaps different.  But, I do have a machine set up
almost exactly as you described the one at Stanford.  I'm a student and
move a lot, so I put my email computer in an offsite place where it can
stay, and I have to get on an airplane if I want to visit it.  The people
who tend it have minimal technical experience. The machine is a sparc, and
it can be managed over a serial console and a modem. When the box goes
down, so does my email, which I read at least three times a day and stay
logged into from my bedside terminal at night. I'm not unaware of the
reliability concerns you're raising, but maybe I'm not responding to them
well enough.

> I am not asking for an embedded OS. I'm asking for robust,
> remotely-manageable servers.
[...]
> > bent around this one Task you're trying to do.
> 
> Mike, that I do find offensive.  Just how long do you think I've been
> dealing with remote system administration?

I have been dealing with this problem for a very long time, too.
Particularly on PeeCee junk.  My FidoNet BBS had one of the best
pcAnywhere-based DOS shells around.  I could even reboot the computer
without hanging up the modem (which required a separate program,
REBOOT.COM, from some utilities collection).  It took me days to get it
right and was littered with .FLG files and special cases, but it was
exceedingly useful. Such experience is questionably relevant, but I've
been around the block as far as ``remote control'' and boxes I'm
responsible for but can't touch, and I know what it's like when you lose
control of them, and I have this spectre hanging over my head every day
right now.  So, I care.

I doubt you'd be on this list and using NetBSD in the first place if you
didn't have a very unique and broad perspective on the industry.  

Nevertheless, you are one person.  You have chiefly one situation in mind.  
Others have valuable opinions.  It would be great if the serial boot
blocks worked better for your job, but they need to work well for other
people's as well, and they also need to remain a clean, well-coded example
of how to implement something properly.  A lot of experience has
contributed to the way things work in this regard, and some of the
reminiscences have crept into this thread. That doesn't mean we can't
change things, but I really wish you'd have more respect for the needs and
experiences that others have brought up. For one thing, there are almost
certainly people on this list who have used architectures you haven't.  

I imagine this is true for almost any reader.  I've only used three or
four architectures myself, but have been positively astounded at the
strange and interesting ways other computers do things.  This experience
_is_ relevant to the discussion at hand.


> Mike, please read that FreeBSD URL I re-posted.

It turns out the URL you posted was incorrect, a typo or something.  The
one I read was:

 http://www.freebsd.org/handbook/x9858.html

Their bootblocks do some things ours don't.  I particularly like the
ability to force serial or screen console without recompiling the boot
blocks.  I also like the keyboard probe option.  At the same time, the
``config file'' stuff is a mess.  It's split across /boot.config and
/boot/loader.rc, which strikes me as confusing and gratuitous.  I'm not
sure how this config-file stuff generalizes to network booting, either.

Another thing we need to all get straight:  the -D option to use both
serial and screen consoles at once is documented to work _only_ in the
bootblocks. Once the FreeBSD ``loader'' (their third stage) and the kernel
get going, output goes to only one or the other.  To explain which one,
they provide a table with eight different cases.  This strikes me as so
confusing and minimally useful that it would be better left out if it's
going to be done like they did it.


> We have _bootloader_ options that need to be persistent.

We have bootloader #define's right now, and a few prebuilt bootloader
alternatives.  You want to make them into runtime options then?  I agree
that would be nice.

I think it needs to be kept really small and simple, though.  I don't like
the idea of the bootloader not being able to figure out on which console
the user is sitting until after it has already found a mountable
filesystem, or perhaps even gotten to the novel ``third stage'' of
booting. You may want to interact with it to help it find said filesystem.  
Actually, the first stage bootblock probably can't support user
interaction, but at least you may want to eventually, some day, define a
bootblock-option that influences where the second-stage bootblock (and
``config file'') are sought.

I would suggest storing the options in a small (say, between 32 and 96
bytes) area actually in the first-stage loader (the first block).  The
options would be analogous to the tiny, tiny NVRAM on NCD X terminals. The
area could be read/set both by issuing commands through installboot from a
booted OS, or by interacting with the (second-stage) bootloader which
would use BIOS calls to rewrite the first-stage sector.
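
To make the size constraint concrete, here is a minimal sketch of such an
option area, written in Python only for brevity (real boot blocks would be
C or assembly). The 64-byte size, the NUL-separated "key=value" layout, and
the option names are all assumptions made up for illustration; none of this
is an existing NetBSD format.

```python
# Hypothetical layout: "key=value" strings, each terminated by a NUL byte,
# padded with NULs up to a fixed size. Not an existing NetBSD format.
CONFIGSPACE_SIZE = 64  # assumed; the text suggests between 32 and 96 bytes


def pack_options(options, size=CONFIGSPACE_SIZE):
    """Encode a dict of options into exactly `size` bytes, or raise if too big."""
    blob = b"".join(
        f"{key}={value}".encode("ascii") + b"\0" for key, value in options.items()
    )
    if len(blob) > size:
        raise ValueError(f"options need {len(blob)} bytes, only {size} available")
    return blob.ljust(size, b"\0")


def unpack_options(area):
    """Decode the fixed-size area back into a dict, skipping the NUL padding."""
    options = {}
    for entry in area.split(b"\0"):
        if entry:
            key, _, value = entry.decode("ascii").partition("=")
            options[key] = value
    return options
```

Both installboot and the second-stage loader would need the same pair of
routines, and setting too many options simply fails rather than spilling
into boot code.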

This preserves the useful feel of Sun's OpenPROM in that you can set
firmware environment variables both from the 'ok' prompt (which is burned
into the ROMs) and from the booted operating system using the 'eeprom'
command:

$ MACHINE=sparc man eeprom

As background, the NVRAM on NCD X terminals is used as a last-ditch way to
get the terminal booted in strange situations.  The NVRAM can be in a
``reset'' or empty state, in which the terminal will send out BOOTP
requests, and then tftp a fixed-name config file that contains more option
settings in the simple textual form

option=value

(There is also a notion of ``tables'' using {}'s and commas, but that's
not important.)

In the typical case, people leave their NVRAM empty and put all options in
the config file.

A subset of the general configuration language can be ``written'' into the
NVRAM using config-file commands.  The terminal compresses and encodes the
option=value information and puts it in the NVRAM, which doesn't assign
memory ranges to fields but rather has a limited amount of ``string
space'' and ``numeric IP address space'' that can be used in various ways.  
For example, you could fix the terminal's IP address and avoid BOOTP, or
you could tell it to use a specific config file name instead of the
default.  If you try to set too many options in NVRAM, you run out of
{s,n}-space.

A la NCD, if there isn't enough room in the first sector for the type of
options you want, maybe the second stage loader could read more options
from a file on the filesystem.  But, I think it's important to have a few
options available instantly so that one can potentially debug and control
as many problems as possible, including failure to load the second-stage
bootstrap.

If we are to follow NCD's example and have two config files (a small
machine-readable first-stage configspace and a larger, human-readable
config file), I think we should also try to follow the spirit of their
work by making the first-stage config space a subset of the config file.
Ex.:

----
# booter configuration file
first-stage {
 option = value
 option = value
}
option = value
option = value
option = value
----

first-stage {...} would be encoded by installboot and written into the
32-to-96-byte configspace in the first block.
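
A rough sketch of how a tool like installboot might separate the two sets
of options before encoding the first one. It is Python rather than C purely
to keep it short, and the function name, the exact "first-stage {" spelling,
and the option names in the usage example are assumptions, not an existing
file format.

```python
def parse_booter_config(text):
    """Split config text into (first_stage, general) option dicts.

    Assumes the block opens with a line reading exactly "first-stage {"
    and closes with a line reading "}"; "#" starts a comment.
    """
    first_stage, general = {}, {}
    target = general
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line == "first-stage {":
            target = first_stage
        elif line == "}":
            target = general
        else:
            key, sep, value = line.partition("=")
            if not sep:
                raise ValueError(f"expected 'option = value', got {raw!r}")
            target[key.strip()] = value.strip()
    return first_stage, general
```

Only the first dict would be squeezed into the first block; the second
stays in the human-readable file for the second-stage loader to read.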

This is a can of worms, because the second-stage loader is only
hypothetically smart enough to update the first-stage configspace--it
can't write to an FFS.  I don't know how to deal with this properly.  Some
program (run at boot perhaps) would have to synchronize the two files
whenever someone changed an option from the boot> prompt.
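
One possible shape for that synchronizing program, sketched in Python. It
assumes the options currently stored in the first block have already been
read into a dict (that part is not shown), and that the config file marks
the block with lines reading "first-stage {" and "}". The function name is
made up, and this is only one way the problem could be handled.

```python
def sync_first_stage(config_text, live_options):
    """Return config_text with the first-stage block rewritten from live_options.

    live_options is a dict of whatever is currently stored in the first
    block; everything outside the first-stage block is left untouched.
    """
    out, inside = [], False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped == "first-stage {":
            inside = True
            out.append(line)
            # Replace the old block body with the live settings.
            out.extend(f" {key} = {value}" for key, value in live_options.items())
        elif inside and stripped == "}":
            inside = False
            out.append(line)
        elif not inside:
            out.append(line)
    return "\n".join(out) + "\n"
```

Run from an rc script, this would make the first block authoritative after
a change at the boot> prompt. Going the other way, from file to first
block, would remain installboot's job.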

This scheme is generalizable to network boot blocks, except that the
first-stage configuration area would be unchangeable because it would be
burned into EEPROM, and the second-stage configuration area would be
loaded via TFTP. So, there would be no making of persistent settings from
'boot>'.  On the plus side, it finally answers the need of folks
who have previously asked for ROMs that presume a fixed IP address and
don't use DHCP.

Of course, if you're going to write the code you are totally free to
disregard my ideas!  I thought that'd be a really sweet way to do it,
though.  Much nicer than LILO's way of stuffing them into a ``map'' file
that was ``special'' and couldn't even be edited with a text editor at a
single-user shell, much less changed from the boot prompt!

> But we dont have one-time overrides.

That's not completely true.  That's exactly what we do have.  All the boot
options (-s, -d, -a) are one-time overrides.  For overrides of more
complex variables inside the kernel, we have ddb, which replaces Linux's
``kernel command line.''
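
To illustrate what "one-time" means here: the flags are parsed from what
is typed at the boot prompt, handed to the kernel for that boot, and stored
nowhere. A small Python sketch follows. The class and function names and
the bit values are placeholders for illustration, not the kernel's actual
definitions; the meanings given for the letters (-a ask for the root
device, -s single-user, -d enter the debugger) are the usual ones.

```python
from enum import Flag


class BootFlag(Flag):
    # Placeholder bit values, not copied from any kernel header.
    ASK_ROOT = 1      # -a: prompt for the root device
    SINGLE_USER = 2   # -s: boot to single-user mode
    DEBUGGER = 4      # -d: drop into the kernel debugger early


FLAG_LETTERS = {
    "a": BootFlag.ASK_ROOT,
    "s": BootFlag.SINGLE_USER,
    "d": BootFlag.DEBUGGER,
}


def parse_boot_flags(args):
    """Collect flags from a boot command line such as 'netbsd -s -d'."""
    flags = BootFlag(0)
    for arg in args.split():
        if not arg.startswith("-"):
            continue  # e.g. a kernel name; ignored in this sketch
        for letter in arg[1:]:
            if letter not in FLAG_LETTERS:
                raise ValueError(f"unknown boot flag -{letter}")
            flags |= FLAG_LETTERS[letter]
    return flags
```

Nothing in this path writes to disk, which is exactly why the next boot
comes up with the defaults again.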

What we don't have is any way to customize boot block behaviour without
recompiling them.

> I have no problem with learning more. [...] But DDB is not really an
> acceptable tool for endusers or for "production" systems.

Okay, we differ here.  I feel that if you're willing to learn ``more,''
more than usual, then ddb is a perfectly acceptable replacement for kernel
command lines.  There is definitely a continuum here, of how _much_ more
you are willing to learn, and it can obviously be taken to absurdity.  I
could give you a hex memory editor and say, ``if you were just willing to
LEARN, you could parse the ELF symbol table by hand, and with a simple HP
calculator you could edit these variables with no problem at all.'' My
personal opinion is that ddb is not a kludge--it's a superior answer to
the one-time override issue for which Linux uses kernel command lines
passed by LILO.  It's more versatile, more maintainable, and keeps
nonsense out of the bootblocks. There have been numerous examples of
people using ddb-on-boot effectively on the list, and a few of people
teaching others to use it effectively, which I found really impressive and
inspiring.

However, I think this is a dead issue.  We aren't arguing about it,
because your project hasn't yet run into a problem that requires
ddb-on-boot option setting.

> NetBSD does NOT automatically select consoles.

Whoa, no, this is _really_ key to what I've been saying here.  NetBSD
absolutely _does_ have code to do this.  I don't use it myself, but have
every reason to believe it works.  Because this is so hard to do on the
PeeCee, it's complex--it can tend toward the screen, or tend toward the
COM port.  It can be based on the presence or absence of COM hardware, or
it can explicitly require a keypress on the chosen console.  It _can_ be
set up to use one console most of the time, but offer to use the other if
the user presses a key there.  It is totally automatic--it's not as good
as Sun's, Alpha's, and I think maybe FreeBSD's, though:  they will
automatically choose the serial port if there's no keyboard plugged in.  
So, the code could be improved, but it's _there_ and it's already fairly
good and useful.
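
The kind of policy described above can be sketched as a small decision
function. This is Python for readability and is _not_ the actual bootblock
code; the function name, parameters, and the ordering of the checks are
assumptions meant only to show how a default, a hardware probe, and a
keypress could combine.

```python
def choose_console(preferred, com_present, key_pressed_on=None):
    """Pick "screen" or "com" for the console.

    preferred      -- console to lean toward ("screen" or "com")
    com_present    -- whether COM hardware was detected
    key_pressed_on -- console where a key was pressed during the
                      probe window, or None if nobody pressed one
    """
    if not com_present:
        return "screen"          # no serial hardware, nothing to choose
    if key_pressed_on in ("screen", "com"):
        return key_pressed_on    # a keypress on either side wins
    return preferred             # otherwise fall back to the default
```

A BIOS layer that fakes keyboard input or hides the real COM port feeds
this sort of logic the wrong inputs, and no ordering of the checks can
make up for that.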

The problem is, it doesn't work with your @#%$ ServerBIOS hack!  The
reason it doesn't work, to me, looks like ``because ServerBIOS is broken,
and was foolishly implemented from the beginning.'' ServerBIOS undermines
and befuddles the bootblock's ability to auto-detect the console.  And,
given what you've told me about it, it's unreasonable to expect an
operating system to do any better than NetBSD does when it's got
ServerBIOS running over its head and deliberately maximizing confusion.

And, I fail to see how FreeBSD currently does a better job of
``autodetecting'' on ServerBIOS machines than we do.  It is just as
manual.  It simply lets you play around with the problem without
recompiling your bootblocks.

If you can code up a scheme to improve console autodetection with
ServerBIOS that doesn't break bootblocks for the rest of us, that's great!
I've told you why I would personally have zero interest in ``enabling''
such a poorly-written technology as ServerBIOS, and am pessimistic about
the feasibility of such a project, but if you're interested, go for it,
right? It's not an unreasonable thing to want. But it's definitely an
unreasonable thing to ask others to write _for you,_ based on the
assertion that NetBSD is somehow broken.

Before you say NetBSD's console autodetection is broken, you really need
to try it on a machine without what I _still_ contend are foolish and
broken BIOS hacks.  That doesn't mean they're not useful to you in some
ways.  It doesn't even mean Intel isn't going to force said destructive
BIOS hacks down everyone's throat in the near future.  It just means
they're probably broken w.r.t. NetBSD, and due to their poor
implementation, necessarily interfere with console autodetection that
otherwise works well.

-- 
Miles Nordin / v:+1 720 841-8308 fax:+1 530 579-8680
555 Bryant Street PMB 182 / Palo Alto, CA 94301-1700 / US