Subject: Re: Losing serial connection
To: None <port-macppc@netbsd.org>
From: Donald Lee <donlee_ppc@icompute.com>
List: port-macppc
Date: 10/18/2001 00:35:48
Several people have written:
>josh> For some reason the connection is useable for a while and then ceases
>josh> to function.
>
>der Mouse> I have a Power Macintosh 4400 which exhibits very similar
>der Mouse> symptoms on its console serial line under 1.4T.  I suspect
>der Mouse> hardware trouble, but I have nothing I can point to as
>der Mouse> evidence of that.
>
>  I also have several OF 2.0.2 machines. At the OF console, on a ttya
>connection, I sometimes copy and paste commands.
>
>  If I paste two lines, the console ceases to work. If this happens
>before multi-user mode has started, I need to hardware-reset to recover.
>
>  It seems to me to be just an overflow of the serial input buffer,
>which the serial chip cannot handle.

I had a similar problem with my Cyclades 8-port serial board.
The serial I/O would work "for a while" and then stop, requiring
a reboot to come alive again.  It would fail more quickly under
heavier load.

I fixed it.

What I did was add essentially a "heartbeat" checker to the driver
that comes around every so many ticks and checks the interrupt
register of the card.  To my surprise, the register would sometimes
show an interrupt pending.  I changed the driver so that if an
interrupt was found pending (and the interrupt handler had not been
called) the checker would simply call the interrupt handler
"manually".  Without this, the interrupt would stay set and never be
cleared, rendering the port dead.
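
For anyone who wants to try the same trick, here is a minimal sketch of
the idea in C.  It is not the actual driver code: the struct, the field
names, and the functions card_intr() and card_heartbeat() are all
hypothetical, and the "interrupt register" is modeled as a plain
variable so the logic can be compiled and tested outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-card state; none of these names come from the real driver. */
struct card_softc {
    volatile unsigned int intr_status; /* stand-in for the card's interrupt register */
    unsigned long intr_calls;          /* how many times the handler has run */
    unsigned long intr_seen;           /* intr_calls as sampled by the previous heartbeat */
    unsigned long lost_intrs;          /* interrupts recovered by the heartbeat */
};

/* Normal interrupt handler: service the card and clear the pending bits. */
static void card_intr(struct card_softc *sc)
{
    assert(sc != NULL);
    sc->intr_calls++;
    /* ... a real handler would move data to and from the chip here ... */
    sc->intr_status = 0;
}

/*
 * Heartbeat, meant to run every so many clock ticks.  If the card shows
 * a pending interrupt and the handler has not run since the last beat,
 * assume the interrupt was lost and call the handler by hand.
 * Returns true when a lost interrupt was recovered.
 */
static bool card_heartbeat(struct card_softc *sc)
{
    assert(sc != NULL);
    bool recovered = false;

    if (sc->intr_status != 0 && sc->intr_calls == sc->intr_seen) {
        sc->lost_intrs++;
        card_intr(sc);
        recovered = true;
    }
    sc->intr_seen = sc->intr_calls; /* remember where the handler count stood */
    return recovered;
}
```

In a real kernel the heartbeat would presumably be driven by a periodic
timer and would have to block the card's interrupt while it looks at
the shared state (on NetBSD, something along the lines of
spltty()/splx()); the sketch leaves both out.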

I have been running this way in production for some time.  The
result is an occasional hiccup in serial performance, but it is
otherwise AOK.  I futzed with the heartbeat so that in most cases
it would catch the "lost" interrupt quickly.  Currently the driver
logs every N occurrences of this "lost interrupt", and I see them
regularly.
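
The "log every N occurrences" part can be sketched like this.  Again,
this is only an illustration: note_lost_intr(), the message text, and
the value of 100 for N are made up for the example and are not taken
from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define LOST_INTR_LOG_EVERY 100 /* hypothetical value for N */

/*
 * Count one lost interrupt.  Only every N-th occurrence is reported,
 * so a steady trickle of lost interrupts does not flood the log.
 * Returns true when a message was printed.
 */
static bool note_lost_intr(unsigned long *count)
{
    assert(count != NULL);
    (*count)++;
    if (*count % LOST_INTR_LOG_EVERY != 0)
        return false;
    printf("serial: %lu lost interrupts recovered so far\n", *count);
    return true;
}
```

The heartbeat would call this each time it has to run the interrupt
handler by hand.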

I have concluded that there is something busted in the low-level
interrupt handler in MacPPC NetBSD, but have been unable to find the
problem.  I am about 95% certain that this is not a problem with
the Cyclades driver, and stories like this one lead me to believe that
it is happening in other places too.

I would be grateful for pointers from anyone who might be able to help
me debug this.

-dgl-