Subject: Re: Breaker... Breaker...
To: PORT-SPARC <Port-SPARC@NetBSD.org>
From: Don Yuniskis <auryn@GCI-Net.com>
List: port-sparc
Date: 05/04/2002 22:27:13
> "John Refling" <johnr@imageworks.com> wrote:
> 
> The break issue on sparcs has been known about
> and beaten to death over and over and over.

Yet, no one has said "this is why it works that way in
a SPARC; these are the SPARCs that *have* the problem;
these are the ones that don't; this is why it works that
way under NBSD..."

Without a nice simple clarification of how and why
it is the way it is, you're likely never to find a better
solution -- unless someone gets fanatical about it!

> It was my impression that there had been a sysctl
> call added which can be set from the command line (or
> in the system startup files) to disable the break
> dropping back to the ROM monitor.  There was definitely
> discussion about doing this.

This implies that the BREAK "feature" is not inevitable.
So, what do you *lose* if you disable it?  Can you still
interrupt the boot sequence with a serial console
(i.e. stay at the ok prompt)?

> I have sparc classics and ss5 and plug and unplug pc
> laptops into the serial ports while they are running
> with never an ill effect.

As mentioned, I suspect it depends on whether the
external devices are running/off when plugged/unplugged,
how the cable is mechanically plugged/unplugged, etc.
I have some machines that don't like to be powered up
while connected; others that don't like to be powered
*off* while connected, etc.

> It is a fact that all the laptops and pc clones
> that I have seen generate a break somewhere near
> the power on, so if the pc is plugged into the
> sparc WHILE the pc is powered on, the sparc WILL
> return to the ROM monitor.
> 
> It was my impression that the break happened quite
> a while into the power up sequence on the PC.  If
> this is true, then it would point to the PC bios
> intentionally sending a break as part of the
> initialization sequence, and not due to power supply
> fluctuation [as can certainly be the case in other
> hardware.]  If this is true, one could go in and alter
> the PC's bios, if you REALLY had a lot of free time.

Unless you are also including VERY OLD pc's
(i.e. before the high levels of integration seen nowadays),
I suspect the problem you are seeing is related to the use
of Combo chips to handle the classic I/O's -- serial
ports, floppy, (sometimes) [E]IDE, parallel port,
(sometimes) RTC, etc.  These devices have a sh*tload
of pins and each pin also has several possible uses.
It is not uncommon for a single pin to have 3 or 4
different functions -- depending on how the designer
wants to capitalize on the Combo's features.

Any pin that *can* conceivably be an input *and*
output (if configured for different functions) must,
of necessity (hand waving here), power up in the 
"input" mode -- since it is conceivable htat some
other bit of electronics may be trying to *drive*
that pin *assuming* that it WILL be used as an
input in the design.

Of course, pins that go to the inputs of the "RS232"
transmitters/drivers, would be *outputs* from the Combo
chip.  So, if those pins serve double duty as potential
inputs for some other application, they will power
up as inputs (to be safe) thereby leaving the inputs
to the RS232 drivers "floating".  If they float "high"
(because of built in pullups or leakage currents),
the driver will push the TxD line, leaving the "pc"
in the SPACE-ing condition (< -3V).

Once the BIOS gets around to reprogramming the 
Combo chip for its *intended* role in the design,
the pin will become an output and immediately
go "low" (the "MARKing" condition of an idle output).

How much of this you see depends on where in the
power up sequence the BIOS starts sorting this stuff out.

> A break could be interpreted by attached hardware as
> a request from the PC for a reset.  Who knows what was
> in the mind of the early PC hardware designers?
>
> The processing of the serial bits coming in on the wire
> is done by the UART (a chip), which decides when a break
> occurs, and not the OS.  A bit in one of the status
> registers will be activated when the UART decides a break
> (or overrun, or parity, etc) occurs.  It is up to
> the OS to look at, or ignore, this status word and
> take some action, typically while reading the character,
> eg, test for a transmission error before processing the
> character.

Um, not exactly.  And, it is quite easy to misinterpret
what the BREAK flag is intended to do -- assuming
the UART *has* this ability, implements it correctly,
etc.

A break is defined as a minimum of 2 character widths 
plus 3 bit times (or maybe 5?  I can't recall) of *continuous*
SPACE-ing on the data line.  This must be followed by
2 character times + 5 (3?) bit times of MARKing before
the next "real" character is transmitted.

Most simplistic BREAK detection algorithms use
"ASCII NUL with Framing Error (and SPACEing
parity, if present)" to "detect" a BREAK.  At one extreme,
this means a SPACEing condition lasting 9 bit times
(1 start bit + 7 bit data + missed framing bit) is
seen as a BREAK.  Given the 7 bit character width,
a *real* BREAK would be 17 bit times of SPACE.
For 8 bit data (e.g., 8N1), 10 bit times would be 
(mis)detected as a BREAK whereas a real break would
be 19 bit times.
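
To put numbers on that, a quick back-of-the-envelope sketch
(C, using the "2 character widths + 3 bit times" figure from
above -- which, as noted, may be misremembered):

    /* Back-of-the-envelope only; the "2 characters + 3 bits" BREAK
       definition is the (uncertain) figure quoted above. */
    #include <stdio.h>

    int
    main(void)
    {
        int databits;

        for (databits = 7; databits <= 8; databits++) {
            /* SPACE run that fools a "misframed NUL" detector:
               start bit + data bits + (missed) stop bit */
            int misframed_nul = 1 + databits + 1;
            /* SPACE run a *real* BREAK requires, per the above */
            int real_break = 2 * databits + 3;

            printf("%d data bits: misframed NUL after %2d bit times, "
                "real BREAK needs %2d\n",
                databits, misframed_nul, real_break);
        }
        return 0;
    }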

A UART that only reports character-based errors
(framing, parity, etc.) will report a BREAK prematurely --
in essence, on receipt of a misframed NUL (Gee, we can
have misframed DEL's, yet we don't consider *them*
special...  :>).

Furthermore, most UARTs operate by *sampling* the data
line at some multiple of the bit rate -- 16 is a common factor.
The "center" of a "bit time" is deduced by noting when the 
line first drops to SPACE.  Thereafter, the data line is 
sampled at the next "anticipated" center bit time (the center
is sought to minimize the effects of slew rate limits placed
on the drivers, noise on the line, distortion, etc.).  So,
a UART of this type signalling "misframed NUL" is
really only saying that the line was SPACING at the
times it was *observed* -- i.e. a line ringing (or being
*driven*!) at some multiple of the sampling rate can
have MARKs *and* SPACEs yet the UART only sees
the SPACEs -- this line is NOT in a BREAK condition
yet is seen as such (the line isn't *staying* at SPACE
continuously for the duration).
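
A toy illustration of that blind spot -- the waveform and the
one-sample-per-bit receiver are both made up for the example:

    /* Toy model: a line that rings back to MARK between the UART's
       sample points looks like continuous SPACE to the UART even
       though it is not.  One sample per bit at the nominal center;
       real 16x receivers differ in detail. */
    #include <stdio.h>

    #define OVERSAMPLE 16           /* samples per bit time */

    static int
    line_level(int sample)          /* returns 1 = MARK, 0 = SPACE */
    {
        int phase = sample % OVERSAMPLE;

        /* Mostly SPACE, with a brief MARK "ring" once per bit time
           that always falls between the points the UART samples. */
        return (phase == 12 || phase == 13);
    }

    int
    main(void)
    {
        int bit, sample, marks = 0, total = 10 * OVERSAMPLE;

        /* What the sampled receiver sees over 10 bit times. */
        for (bit = 0; bit < 10; bit++) {
            sample = bit * OVERSAMPLE + OVERSAMPLE / 2;
            printf("bit %d sampled as %s\n", bit,
                line_level(sample) ? "MARK" : "SPACE");
        }

        /* What the line actually did over the same interval. */
        for (sample = 0; sample < total; sample++)
            marks += line_level(sample);
        printf("...yet the line sat at MARK for %d of %d samples\n",
            marks, total);
        return 0;
    }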

Fancier UARTs have a BREAK detector that watches 
the line continuously.  The actual  data passed to the
processor is the result of a sampling scheme.  But, the
BREAK detector watches the continuous *level* and
signals the START OF a BREAK if it sees a solid
SPACEing level for the duration of the "character".
I.e. this detector isn't fooled by a line toggling at some 
rate faster than the bit rate.  So, if you receive a
"misframed NUL" on this type of UART and *don't*
see a START OF BREAK signaled, it tells you that
either your line is noisy as all h*ll *or* your bit rate
setting is incorrect.

The second half of this detection scheme is waiting for
a signal from the UART that the data line has returned
to the MARKing condition -- the END OF BREAK
signal.  If you note the relative times at which each of
these boundaries was detected, you can arithmetically
determine if a "genuine" BREAK has been detected
(or, just a misframed NUL possibly followed by some
other character with "0"s in the right places to stretch
the missed framing bit into the following start bit and
some possible number of "low" data bits).
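
In code, the arithmetic on those two timestamps might look
roughly like this -- a sketch only; the microsecond timestamps
and the threshold figure are assumptions, not any real driver's
interface:

    /* Sketch: decide, from the START OF BREAK and END OF BREAK times,
       whether the SPACE-ing run was long enough to be a genuine BREAK.
       Timestamp source and threshold figure are assumptions. */
    #include <stdbool.h>

    static bool
    is_genuine_break(unsigned long start_us, unsigned long end_us,
        unsigned long bit_us, int databits)
    {
        unsigned long space_bits = (end_us - start_us) / bit_us;

        /* "2 characters + 3 bits", as (shakily) recalled above */
        return space_bits >= (unsigned long)(2 * databits + 3);
    }

At 9600 baud (bit_us around 104) and 8 data bits, that rejects
anything shorter than ~19 bit times of SPACE -- roughly 2ms.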

And, if you *really* want to bust balls, you wait
for the prescribed time interval verifying an ABSENCE
of additional data (i.e. no further START bits) before
you really report the "BREAK"  (n.b. receipt of a 
character in this interval is sufficient to *discount* the
"real break" -- yet you can't rely on the next character's
occurrence to determine the interval has been long
enough... another character may not come!)
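
That last step is easiest to picture as a little state machine
driven by the UART's two signals plus a timer.  The timer and
reporting calls below are placeholders, not any particular
driver's API:

    /* Sketch of the "wait out the quiet interval" step. */

    #define QUIET_BITS 19   /* e.g. 2 characters + 3 bits, 8 data bits */

    enum brk_state { IDLE, IN_BREAK, AWAIT_QUIET };
    static enum brk_state state = IDLE;

    /* Placeholders -- wire these to a real timer and tty layer. */
    static void timer_arm(int bits) { (void)bits; }
    static void timer_cancel(void) { }
    static void report_break(void) { }

    void
    on_start_of_break(void)
    {
        state = IN_BREAK;
    }

    void
    on_end_of_break(void)           /* line returned to MARK */
    {
        if (state == IN_BREAK) {
            state = AWAIT_QUIET;
            timer_arm(QUIET_BITS);  /* MARK must hold this long */
        }
    }

    void
    on_rx_char(int c)               /* a start bit arrived */
    {
        (void)c;
        if (state == AWAIT_QUIET) {
            timer_cancel();         /* too soon: discount the "BREAK" */
            state = IDLE;
        }
    }

    void
    on_quiet_timer(void)            /* the quiet interval elapsed */
    {
        if (state == AWAIT_QUIET) {
            report_break();         /* the real thing */
            state = IDLE;
        }
    }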

It's also worth noting that some UARTs misbehave in
the time after the START OF BREAK condition
(i.e. once the framing error has been detected) and
cannot reliably detect the character *following*
the BREAK...

I have never seen a "chip" that does all of these things.
The better ones just provide *support* for the OS to
recognize the BREAK (via the START/END OF BREAK
signals) -- but the OS must actively *do* something
more than just reading a single bit in a status register
if it really wants to detect BREAKs (as opposed to
misframed NULs).

> When I looked at the kernel code long ago, the action
> of the NetBSD kernel was to call an entry point in the
> PROM chip (the EPROM on the sparc motherboard) which
> then starts up the ROM monitor.  One could simply
> remove that line from the kernel code and recompile
> a kernel which would ignore breaks.  I think I did that
> long ago, and I think it worked.

OK, then what else do you lose if you do this?  I.e. why
*should* it be in the kernel if you don't intend to do any
kernel hacking, etc.?  Is this a "feature" that should really
have been an *option* instead of "default behaviour"?

> Presumably, the sysctl function (if it were ever
> implemented) controls this from the command line,
> without the need to recompile a kernel.

OK.  So it could also be added to rc.conf.

> NetBSD may behave differently on different ports since
> in the past different ports have had different kernel
> options enabled/disabled.  Also, the serial port drivers
> might be more fully implemented on some ports.  One could
> have the serial port driver ignore the break bit.
> 
> It has been desirable for various OS to drop to some
> ROM monitor in the past: most/all Sun hardware and
> DEC hardware (VMS) does it.  In my opinion, we should
> not be distributing NetBSD with this option enabled
> due to the robustness issue.  See PR/11946 as you can
> generate a "break" on a 19200 baud line by sending
> a character at 110 baud.  Don't even need a break key!

Most telecom devices generate BREAKs that are quite
long -- 100ms being a relatively *short* one (200
and 500ms are not uncommon).

> And given the fact that the sparc install floppy will
> change the baud rate of the serial console without
> warning, the system is likely to crash at install.
> 
> I think this is confusing for new users and irritating
> for everyone else who forgets to unplug their cheap PC
> terminal from the sparc when resetting the PC.

Agreed.  It seems like a feature that "benefits" a small
few -- folks who *should* be knowledgeable enough to
turn it back on if THEY want to put up with the BREAK
issues (because it is presumably giving them a feature they
want/need and are thus willing to tolerate).

> Also, in my opinion, the resistor is not going to do
> much to solve this issue... that was the solution to
> another problem... a terminal (or cable) which did not
> implement some signals (like DTR or CTS) which the OS
> wanted in order to detect a powered-up terminal and turn on a
> getty.

It is not uncommon for noise to fiddle with open modem
control lines.  Usually, a getty will be spawned, then see
the line "hung up", then reconnected, etc.

> Those OSs were designed that way.
> 
> The "immunity" boxes probably have two UARTs back-to-
> back... the 8 bits collected from the receiving end
> of one are passed to the transmitting part of the
> other.  Each incoming character would strobe the same
> outgoing character.  The status word which would have
> the break bit would purposely not be passed on to the
> transmitting half, and thus the break would be blocked.

That would render it impossible to detect a BREAK.
The box I am using (not intended for this application)
just has a more robust way of detecting BREAK.
So, *if* I send a *genuine* BREAK, it is passed
through the box to the SPARC.  But, if I send a
"misframed NUL", only a (reframed) NUL is passed 
through (using the default behaviour of the box).
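
In other words, the forwarding logic is closer to this -- not
the box's actual firmware, obviously; rx_poll(), tx_char() and
tx_break() just stand in for the two UART halves:

    /* Rough idea only; not the firmware of any real box. */
    #include <stdbool.h>

    struct rx_event {
        bool genuine_break;         /* robust detector saw a real BREAK */
        unsigned char ch;           /* the character as sampled, if any */
    };

    /* Placeholders for the receiving and transmitting UART halves. */
    static bool rx_poll(struct rx_event *ev) { (void)ev; return false; }
    static void tx_char(unsigned char c) { (void)c; }
    static void tx_break(void) { }

    void
    forward(void)
    {
        struct rx_event ev;

        while (rx_poll(&ev)) {
            if (ev.genuine_break)
                tx_break();         /* regenerate a real BREAK */
            else
                tx_char(ev.ch);     /* everything else -- including a
                                       misframed NUL -- goes out as a
                                       cleanly framed character */
        }
    }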

> I can't imagine any analogue circuit which would
> reliably "filter" out the break and pass the "desired"
> signals.

Nope.

> I hope that the sysctl to disable the break action has
> been implemented... This will solve most of our issues.

Or, if there is a way to disable this *before* NBSD sees
it (e.g. in the boot rom configuration?)