port-sparc: Re: Breaker... Breaker...

Subject: Re: Breaker... Breaker...
To: NetBSD/sparc Discussion List <port-sparc@NetBSD.ORG>
From: Don Yuniskis <auryn@GCI-Net.com>
List: port-sparc
Date: 05/06/2002 10:48:26
> "Greg A. Woods" <woods@weird.com> wrote:
> [ On Saturday, May 4, 2002 at 22:27:13 (-0700), Don Yuniskis wrote: ]
> > Subject: Re: Breaker... Breaker...
> >
> > > "John Refling" <johnr@imageworks.com> wrote:
> > >
> > > The break issue on sparcs has been known about
> > > and beaten to death over and over and over.
> >
> > Yet, no one has said "this is why it works that way in
> > a SPARC; these are the SPARCs that *have* the problem;
> > these are the ones that don't; this is why it works that
> > way under NBSD..."
> >
> > Without a nice simple clarification of how and why
> > it is the way it is, you're likely never to find a better
> > solution -- unless someone gets fanatical about it!
>
> On most any unix-like system, particularly servers, there's a notion of
> having a "console" TTY-like device which is where the operator can
> interact with first the system firmware, and later the operating system

[snip]

Sorry, I guess my question about "why it is the way it is" was too vague.
<:-(  I meant why the software is written/layered the way it is -- and,
how exactly that is!  I.e. so that I could imagine ways to bend the
implementation to make this "feature" more robust.

E.g., on boxes that I design, I have a monitor running continuously
(as a sort of "privileged task/process/thread").  Since it is an active
and independent entity coexisting ALONGSIDE the rest of the
system's tasks (let's not quibble over choice of terms), the method(s)
by which it is "invoked"/activated/applied can be more flexibly
chosen.

For example, since my monitors don't run IN LIEU OF the system,
there is no need to make the decision to invoke/apply the monitor
based on a single keystroke *and* as soon as that keystroke is
detected.  In my case, no other threads/tasks/processes/etc. are
affected by the activities at the "console" until a *command* directing
the monitor to *attach* to a particular thread/etc. is received.
Once a particular thread is attached, then the monitor preempts that
thread, allowing you to examine/exercise/step/modify it -- while
the rest of the system proceeds oblivious to this (except that any
interactions with this thread are now obviously slowed...).

But, this is mainly intended as a software *development* tool.
"Poor man's ICE", etc.  And, since removing the code from the
production release would *change* that released software (so
that it no longer exactly corresponds with the software that was
tested/validated prior to release), the monitor remains active in
the released product in perpetuity (which has the benefit of acting
as a nice troubleshooting tool "in the field").

Anyway, given my rough understanding of how the BREAK is
layered in the software, (though without any first hand knowledge
of how "capable"/flexible the kernel design is in terms of these
sorts of "micro-details") perhaps some similar mechanism
could be applied to effectively *filter* the BREAKs seen by
the kernel.

For example, seeing the BREAK and just *remembering* that
you saw it.  Then, waiting for some (multiple?) character sequence
to actually cause the trap to the monitor.  Perhaps something as
hokey as
    BREAK 's' 't' 'o' 'p'
Of course, this means folks who want to use this feature are
now burdened with this (annoying) keystroke sequence.  So,
perhaps something easier
    BREAK  BREAK
(with a timeout to "forget" the first BREAK after some amount of time)
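
A minimal sketch of the BREAK BREAK idea, in C.  This is not NetBSD
kernel code: every name here is hypothetical, the 200-tick window is
an arbitrary assumption, and cancelling on any ordinary character is
one possible design choice rather than something settled above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; not taken from any real driver. */
#define BREAK_WINDOW_TICKS 200u   /* assumed: how long the first BREAK is remembered */

struct break_filter {
    bool     armed;       /* true once a first BREAK has been seen */
    uint32_t armed_at;    /* tick count when that first BREAK arrived */
};

/*
 * Call once per BREAK event.  Returns true only when this BREAK is the
 * second one inside the window, i.e. when the monitor should be entered.
 */
bool break_filter_on_break(struct break_filter *f, uint32_t now)
{
    /* Unsigned subtraction keeps the comparison valid across tick wraparound. */
    if (f->armed && (uint32_t)(now - f->armed_at) <= BREAK_WINDOW_TICKS) {
        f->armed = false;          /* consume the pair */
        return true;
    }
    f->armed = true;               /* first BREAK, or the old one expired */
    f->armed_at = now;
    return false;
}

/* Call for every ordinary character: other input cancels a pending BREAK. */
void break_filter_on_char(struct break_filter *f)
{
    f->armed = false;
}
```

A lone BREAK from an unplugged terminal only arms the filter; nothing
traps unless a second BREAK arrives before the window expires.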

Depends on how much of a hassle the "unplugged terminal"
is considered BY THESE USERS -- assuming "typical" users
would opt to just disable the BREAK altogether (the only thing
I have ever typed after a BREAK has been "go"  :>  )

> > This implies that the BREAK "feature" is not inevitable.
> > So, what do you *lose* if you disable it?  Can you still
> > interrupt the boot sequence with a serial console
> > (i.e. stay at the ok prompt?)
>
> Well, on a sun4-class sparcstation in particular, that'll depend on
> exactly where and how you disable it.
>
> If you use hardware logic to intercept and re-process your serial data
> to ensure a BREAK condition can never occur then you'll never be able to
> interrupt a boot or hard-halt the system from multi-user mode, or drop
> into the debugger (ddb(8)) on demand, etc.

With the exception of interrupting a boot, I suspect most "typical"
users would easily forfeit the other concessions.

[As an aside, can you still interrupt the boot using the *keyboard*
if the BREAK is *completely* disabled?]

> If you take out the code in the kernel which handles the BREAK condition
> on the console serial port then I think you'll be able to interrupt a
> boot up until the point where the kernel console code is attached and

How soon/late is that?  I.e. could you quit before the kernel completely
loads?  Could you quit while all the probe()s are taking place?

> activated, but you will not be able to drop to the monitor prompt or debugger
> prompt.
>
> There's still the BREAK detection in the firmware, of course...
>
> All the issues you discuss w.r.t. UARTs and how they detect BREAK are of
> course very real.  The older Sun sparcstations all (IIRC) use the z8530
> UART (one for both ttya/ttyb and another for kybd/mouse).  The tty code
> sets bit 0x80 in register 15 to enable BREAK status interrupts and then
> checks bit 0x80 in register 0 to see if an interrupt was due to BREAK.

Argh!  I haven't used an SCC -- since last CENTURY!  :>  Undoubtedly
(I don't have databook handy... one of these days I will unpack the 100+
cases of books in the garage..), the first 0x80 referenced sets the "IE" bit
for BREAK -- "interrupt me when you (think you) see a BREAK".
The second 0x80 reflects the state of the "BREAK detected" flag.
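
The two 0x80 bits can be pictured with a small C sketch.  The struct
below is only a stand-in for the chip (real code would go through the
SCC's control port), and the macro and function names are hypothetical,
not the ones used in the NetBSD driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define WR15_BREAK_IE 0x80u  /* write register 15, bit 7: interrupt on BREAK status change */
#define RR0_BREAK     0x80u  /* read register 0, bit 7: BREAK condition flag */

/* Stand-in for the chip's registers, so the logic can be run anywhere. */
struct fake_scc {
    uint8_t wr15;
    uint8_t rr0;
};

/* "Interrupt me when you (think you) see a BREAK." */
void scc_enable_break_intr(struct fake_scc *scc)
{
    scc->wr15 |= WR15_BREAK_IE;
}

/* In the status interrupt handler: was this interrupt due to BREAK? */
bool scc_intr_is_break(const struct fake_scc *scc)
{
    return (scc->rr0 & RR0_BREAK) != 0;
}
```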

But, the "BREAK detect" bit is really little more than "START OF BREAK".
Or, "START OF *POSSIBLE* BREAK" -- since it only tells you that
you have seen 1 character + 2 bit times (1 start, 1 framing -- assuming
parity set to NONE) of SPACEing.  You still need another whole character
time (plus another bit time -- that of the START bit in the "character"
following this "misframed NUL") to get a bona fide BREAK.

Fortunately (?  Heh heh heh), the SCC doesn't sense MARK->SPACE
transitions to signal the START bit detection.  (some UARTs do -- so,
once a BREAK starts, they will not recognize the start of a *new*
character until the BREAK terminates)  So, once the "misframed NUL"
is detected, the SCC will keep receiving "characters" as long as the
RxD line remains in the SPACE state (i.e., if held to SPACE, it will keep
receiving an endless stream of NULs).  This can be exploited to do a
more "accurate" BREAK detect -- if the first character following the
START OF BREAK detected is also a NUL (framed *or* misframed,
though if framed properly, no other NULs -- see below -- can follow)
*and* can be verified as having *immediately* followed the first NUL
(i.e. so you can state for sure that the data line did not return to MARK
between the characters), then you have a bona fide BREAK.
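
A rough C sketch of that qualification step.  It assumes 8N1 framing
and, more importantly, that each received character can be stamped
with a completion time in bit times; the SCC does not provide that by
itself, so the timestamp, the struct, and the function names are all
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define FRAME_BITS 10u   /* assumed 8N1: 1 start + 8 data + 1 stop */
#define SLACK_BITS 1u    /* assumed tolerance on the timestamps */

struct rx_event {
    uint8_t  ch;          /* received character */
    bool     framing_err; /* stop bit was sampled as SPACE */
    uint32_t end_time;    /* completion time in bit times (hypothetical) */
};

/* True if `first` looks like START OF BREAK: a misframed NUL. */
bool is_break_start(const struct rx_event *first)
{
    return first->ch == 0x00 && first->framing_err;
}

/*
 * True if `next` confirms the BREAK: it is also a NUL (framed or not)
 * and it completed about one frame time after `first`, which leaves the
 * line no room to idle at MARK for longer than SLACK_BITS in between.
 */
bool confirms_break(const struct rx_event *first, const struct rx_event *next)
{
    uint32_t gap = next->end_time - first->end_time;

    return is_break_start(first)
        && next->ch == 0x00
        && gap <= FRAME_BITS + SLACK_BITS;
}
```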

Note that any number of misframed NULs can then occur.  And, the
"last" character in this "string" will be one in which there are no "0"
bits more significant than the least significant "1" bit  (e.g., 0x00, 0x80,
0xC0, 0xE0, 0xF0, 0xF8... 0xFF).  If the last such character is a NUL,
it may be framed or misframed.  All other characters must be framed
properly.
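
The "no 0 bits above the lowest 1 bit" rule can be checked with a
little bit arithmetic.  A sketch in C, assuming 8 data bits sent LSB
first; the function name is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if `ch` could be the last character of a BREAK: once the line
 * returns to MARK every later data bit reads as 1, so the value is a
 * run of 0s in the low bits followed by 1s up to bit 7.
 * Inverted, that is a run of low 1s (2^k - 1), and x & (x + 1) is zero
 * exactly for values of that form.
 */
bool is_valid_break_tail(uint8_t ch)
{
    unsigned inv = (~(unsigned)ch) & 0xFFu;

    return (inv & (inv + 1u)) == 0;
}
```

This accepts 0x00, 0x80, 0xC0 ... 0xFF and rejects values such as 0x40
or 0xA0, where a 0 bit appears after the line has already gone to MARK.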

IIRC, the SCC will generate a *second* "BREAK" interrupt when
the RxD line returns to MARKing, again -- an END OF BREAK
interrupt.  If your interrupt latency is low enough <big grin> *and*
you know you won't drop any interrupts, you can use these
START/END interrupts to detect the event -- and then just qualify
it with a gross count of the number of bits/characters between those
events.
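
The "gross count" qualification might look like this in C.  It assumes
a microsecond timestamp can be taken in each interrupt handler; the
function names and the 20-bit threshold are assumptions for
illustration, and the threshold would need tuning since the START
interrupt only fires once the first misframed NUL has been received.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIN_BREAK_BITS 20u  /* assumed threshold: about two 8N1 frames of SPACE */

/* Convert an interval in microseconds to whole bit times at `baud`. */
uint32_t usec_to_bits(uint32_t usec, uint32_t baud)
{
    return (uint32_t)(((uint64_t)usec * baud) / 1000000u);
}

/*
 * start_usec / end_usec: timestamps taken in the START OF BREAK and
 * END OF BREAK interrupts.  Returns true if the line stayed at SPACE
 * long enough between them to count as a deliberate BREAK.
 */
bool break_long_enough(uint32_t start_usec, uint32_t end_usec, uint32_t baud)
{
    uint32_t held = end_usec - start_usec;   /* wrap-safe unsigned difference */

    return usec_to_bits(held, baud) >= MIN_BREAK_BITS;
}
```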

Note that I have ignored the role of parity in all this... :-(

If my memory is a bit fuzzy, my apologies.  If someone wants to
dig into this deeper, I can possibly be persuaded to try to locate
any SCC materials in my "stash" (though I still have two workstations
to get up and running here -- "So little time, so much to do...")