port-i386: Re: serial console HOWTO?

Subject: Re: serial console HOWTO?
To: Jonathan Stone <jonathan@DSG.Stanford.EDU>
From: Miles Nordin <carton@Ivy.NET>
List: port-i386
Date: 01/20/2000 00:05:17

On Wed, 19 Jan 2000, Jonathan Stone wrote:

> "server management microprocessor" and "emp".  You dont get those on
> desktop Wintel motherboards.

Well then you should.  That was my point, I think.  Basic workingness
doesn't belong in a separate add-on option for servers only.

Stratus will sell you an hppa clone that really _is_ more ``reliable''
than a regular HP minicomputer.  Whether this is a good approach to
designing a reliable system overall is a story for a different day.  But
that's not what Intel is selling.  Serial console, error logging,
watchdogs (do they even have a watchdog timer in that thing?), this Intel
stuff is just more giving land back to the Indians(?), as they say.

> a line that's transparent to DC1/DC3,

I don't know what DC1/DC3 is.  I think we're fine here, because I already
explained that all this only matters if you want to link emacs into the
bootblocks.  Once NetBSD goes multiuser, you can ask for any flow control
you want.
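
For reference, DC1 and DC3 are the ASCII control characters 0x11 and 0x13,
better known as XON and XOFF, so the quoted line is presumably about
software flow control.  What follows is only a minimal sketch of "asking
for any flow control you want" from userland through the standard termios
interface.  The names set_flow and enum flow_mode are made up for
illustration, and CRTSCTS is a common BSD/Linux extension rather than
POSIX, so it is guarded.

```c
#include <termios.h>

/* Hypothetical helper type: which flow control the caller wants. */
enum flow_mode { FLOW_NONE, FLOW_XONXOFF, FLOW_RTSCTS };

/*
 * Adjust only the flow-control bits of *t.
 * Returns 0 on success, -1 if the requested mode is not available.
 */
int set_flow(struct termios *t, enum flow_mode mode)
{
    /* Start from "no flow control at all". */
    t->c_iflag &= ~(tcflag_t)(IXON | IXOFF | IXANY);
#ifdef CRTSCTS
    t->c_cflag &= ~(tcflag_t)CRTSCTS;
#endif

    switch (mode) {
    case FLOW_NONE:
        return 0;
    case FLOW_XONXOFF:
        /* Software flow control: DC1 (XON) resumes, DC3 (XOFF) pauses. */
        t->c_iflag |= IXON | IXOFF;
        t->c_cc[VSTART] = 0x11;
        t->c_cc[VSTOP] = 0x13;
        return 0;
    case FLOW_RTSCTS:
#ifdef CRTSCTS
        /* Hardware flow control: needs RTS/CTS actually wired up. */
        t->c_cflag |= CRTSCTS;
        return 0;
#else
        return -1;
#endif
    }
    return -1;
}
```

On a running system one would fetch the current settings with tcgetattr(),
pass them through something like set_flow(), and apply the result with
tcsetattr(fd, TCSANOW, &t).  None of this is available at a boot> prompt,
which is the point of leaving the line alone until then.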

> I want this line to be _more_ reliable than your average PPP line.

Yes.  Of course you do.  So do I.  I want the console to work all the
time, whether it's because I want to extract useful debugging information
from a confused box sitting next to me, or because I want to extract
information and reboot a box that's far away.  That's why I don't want
flow control on it by default!  Certainly, I at least don't want to have
to get flow control up before I can see a boot> prompt.  
flow control != reliability.

I can see why you want flow control on the line after you've got the box
running multiuser.  But, I don't see how it increases ``reliability''. You
experience a failure in reliability when you can no longer control your
box, like, oh, say when you tell your nontechnical on-site lackey to plug
in a terminal server that was working just fine on the Sun in the
neighboring rack, and nothing comes out of your PeeCee's serial port
because RTS isn't wired up.  You power-cycle it--not a peep.  What's going
on?  EMP is supposed to give out error messages.  The console, the
prompt...  You have no control over your machine. You lose.  You rack your
brain, trying to figure out why.

Not having flow control when you need it?  Your screen gets corrupted. You
press ^L.  You use ed or vi instead of emacs.  You are annoyed, but you
keep your box up.  (or, more realistically, nothing happens at all because
the no-flow-control period lasts only through the boot> prompt)  _This_ is
the unique need of a console port. You need a console port when telnet
fails.  It doesn't need to be friendly. It needs to _work_ in cases of
utter desperation.

> If you think of a Sun, or an Alpha, located in a physically-secure
> remote wiring closet (or co-located at an ISP, even), using a serial
> port as console, that's much closer to what I want to do. (The
> tradeoffs in FreeBSD's serial consoles seem much closer to that,
> btw.).

How are FreeBSD's serial console hacks for the PeeCee more usefully like
srm and OpenPROM than ours?

> it'd be nice if the ASCII screen-painting doesnt get _too_ far behind
> the BIOS delay loop. [...] That's a motive to run at high speeds.  
> Which in turn is a motive for flow control....

Have you tried srm or OpenPROM or OpenFirmware?  Maybe you have.  I'm just
surprised you don't find this situation absolutely ridiculous.  I would
not want to spend my time writing any code to work around the failings of
such incompetents.  I wish I could leave it at this, but I probably ought
to enumerate the failings, depressing as such work is:

 1. They're fired for presuming the existence of any kind of smart
    terminal before something like termcap/curses is available to do it
    properly.  Do they support ^L screen redraws?  Do they support setting
    your terminal type?  Do they support the character in the bottom right
    square of the screen?  How do they deal with Line 25?  Assuming
    Phoenix is the UberBIOS and the answers to all these
    insoluble questions are ``yes'' and ``magically,'' how big is the
    firmware?  How maintainable is it to keep parallel versions of
    termcap and curses equivalents in the operating system and in the
    firmware?  Will it keep working over time, or will it accumulate bugs 
    due to lack of maintenance?  But, of course, it doesn't do any of
    these things, does it.  This is not unlike the can of worms others and
    I have opened trying to use AlphaBIOS or ARC or whatever that
    blue-screened nonsense is called, without a display head.

 2. They're fired again for not updating their interactive screens to work
    with serial consoles.  This forces them into #1 above.  It also
    creates problems in that--what if your terminal doesn't support color?
    doesn't support reverse video blinking greyed out peecee line drawing
    characters? uses a Finnish character set?  Do the arrow keys work?
    The f-keys?  The obscure control-alt-shift-esc-scroll-lock combinations?
    I suppose they do this because they wrote the interactive screens
    by hand in assembly language, and haven't fundamentally changed them
    in fifteen years.  They're probably among Phoenix's most valuable
    intellectual property, and besides the pacman video game scum that
    designs those crazy BIOS menus probably has an easier time writing CUI
    code in 8086 assembler than parsing a command line.  ``Grammar?  I
    didn't take English.  I work with Computers for my job.''

 3. They're fired again for this whole concept of ``time windows of
    desperation.''

    Beep.
    Clunk, clunk.
    Debugging information: ......
    Configuration: .....
    ....one second delay...
    [Screen Clears--information disappears.  Where's my Polaroid?]
    [Fancy Phoenix logo]....three second delay...[Bird's wings flap
      once slowly, 3D rendered.  Logo fades out.]
    If your computer or keyboard is broken, Press ESC now.
    ...one second delay...
    Hahaha, too late!
    Missing Operating System.
    Please reboot your computer and start over.

On a Sun, you can send <Break> or press L1-A any time you like, and you
get an ``ok'' prompt.  It doesn't matter if the operating system was
sought but not found, is loading, is loaded, is running and has 10 people
logged in.  <Break> gets you an ok prompt.  There is no time delay.  No
window of opportunity.  You press that key any time you want and you get
a friggin' prompt, that simple.

Now, Intel on the other hand, gives you the chance to reboot your system
with a virtual-power-cord-pulling.  You get another three softballs and
you can play dunk-the-clown all over again, as many times as you like.
Some day maybe you'll get the hang of it!  Oh, and by the way, you have to
dedicate a serial port _just_ to powering off the machine.  You can use
the other port to play dunk-the-clown!

Did I say I was going to agree to disagree? :)

> In an ideal world, the bootblocks could ascertain that the BIOS is
> using a serial line as console, obtain the line speed, etc. from where
> the BIOS keeps it, and pass it onto the console.

No, not in my ideal world at least.  In an ideal world, such as exists on
macppc, sparc, alpha, arm32/shark and possibly others, the bootblocks do
not ascertain anything. They simply say, ``here's a line of output text.
Render it to the framebuffer, or send it over the serial line, or beep the
speaker in morse code--whatever you're doing these days.''

You don't need to know what _device_ is doing the console until you're
ready to _take over_ that device, which is half way through the kernel's
startup procedure.
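
A rough sketch of that arrangement, in C: the bootblock holds an opaque
output hook handed over by the firmware and never asks what device is
behind it.  Every name here (struct fw_console, boot_puts, the membuf
backend) is hypothetical and stands in for whatever entry point a real
firmware exports; it is not the OpenFirmware or srm interface.

```c
#include <stddef.h>

/* Hypothetical firmware handoff: "give me a character, I'll show it". */
struct fw_console {
    void (*putc)(void *cookie, char c); /* framebuffer, serial, speaker... */
    void *cookie;                       /* firmware's private state */
};

/* The bootblock's whole view of the console: emit one line of text. */
void boot_puts(const struct fw_console *fw, const char *line)
{
    for (; *line != '\0'; line++)
        fw->putc(fw->cookie, *line);
    fw->putc(fw->cookie, '\n');
}

/* Stand-in backend for illustration: collect output in memory. */
struct membuf {
    char data[128];
    size_t len;
};

void membuf_putc(void *cookie, char c)
{
    struct membuf *b = cookie;

    /* Keep room for the terminating NUL; silently drop overflow. */
    if (b->len + 1 < sizeof b->data) {
        b->data[b->len++] = c;
        b->data[b->len] = '\0';
    }
}
```

Swapping membuf_putc for a serial or framebuffer routine changes nothing
in boot_puts, which is the property being argued for: the caller only
learns what the device is when it is ready to drive it itself.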

This is why NetBSD/sparc booted on Tadpoles with Real Computer Firmware
even though NetBSD had no idea how to use the crazy video chip inside them
at the time.  NetBSD said, ``What's the console on?''
OpenFirmware said, ``It's on a Weitek P9100 attached at _______''
NetBSD said, ``Yetch!  I've never heard of one of those.  I'll just let
  you know when I want to write some text to the screen, k?''

That's what would happen in an ideal world.  In a much less ideal world,
if the BIOS could simply refrain from taking over the serial port so that
no one else could use it, that'd be great.  Likewise, if the BIOS could
refrain from deliberately, actively trying to deceive the OS into
believing it had a screen and keyboard plugged in when it didn't, that
would help too. Passing down explicit information? Bah, that's rather a
tall order compared to simply _not telling lies_. What would Phoenix
Technical Document 353-24.19 really buy you?  Yet another channel for
lies, damn lies. ``No, this time, we really mean it! We're really using a
serial console/We're really using a screen.  You can trust us. This is
Revision 45.24 of the Bios98 protocol, updated with all the latest
features in Real Computing.  Would we lie?  Would we tell you there was a
screen when there wasn't?  Go ahead.  It's there.  use it.  _trust us_.''

-- 
Miles Nordin / v:+1 720 841-8308 fax:+1 530 579-8680
555 Bryant Street PMB 182 / Palo Alto, CA 94301-1700 / US