Subject: tty layer bogons from the depths of the abyss
To: None <tech-kern@NetBSD.ORG>
From: Charles M. Hannum <mycroft@mit.edu>
List: tech-kern
Date: 03/19/1998 05:00:18
[Warning: The following is a ball of hair. It will damage your brain.
Don't read it.]
So I decided to research the history and usage of WOPEN a bit more
before redefining it. It appears that in 4.4 it was used as follows:
1) On first open of the device, and on subsequent opens that block
waiting for carrier, WOPEN is set.
2) When the device is successfully opened, WOPEN is cleared, and
ISOPEN is set.
3) When CLOCAL is turned off and carrier is not present (CARR_ON is
off), ISOPEN is cleared and WOPEN is set. [Note: This is a lie.
See why below.] This has the effect of causing subsequent reads
and writes to block awaiting carrier.
4) On final close, if WOPEN is set (translation: since nothing can be
blocking in the open routine if this is a final close, this means
that the device must have transitioned to WOPEN as in #3 above),
DTR is always turned off, even if HUPCL is not set.
5) Things were carefully constructed such that any time WOPEN was
cleared, all processes that might have been waiting for carrier in
the open routine awoke, set WOPEN again, and went back to sleep.
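As a rough illustration of items 1 through 4, here is a small userland model of the flag transitions as described above. It is only a sketch: the struct, the function names, and the M_* flag values are made up for this example and are not the 4.4 kernel's code, and item 3 is modelled as it was apparently intended to work, not as it actually behaved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this model only; not the kernel's. */
#define M_WOPEN   0x1u  /* an open is in progress or waiting for carrier */
#define M_ISOPEN  0x2u  /* the device has been opened successfully */
#define M_CARR_ON 0x4u  /* carrier is present */

struct model_tty {
	unsigned state;  /* combination of the M_* flags above */
	bool clocal;     /* CLOCAL: ignore modem status lines */
	bool hupcl;      /* HUPCL: hang up on final close */
};

/* Item 1: an open that starts, or blocks waiting for carrier, sets WOPEN. */
static void
model_open_begin(struct model_tty *tp)
{
	tp->state |= M_WOPEN;
}

/* Item 2: a successful open clears WOPEN and sets ISOPEN. */
static void
model_open_done(struct model_tty *tp)
{
	assert(tp->state & M_WOPEN);
	tp->state &= ~M_WOPEN;
	tp->state |= M_ISOPEN;
}

/* Item 3, as intended: turning CLOCAL off without carrier reverts to WOPEN. */
static void
model_set_clocal(struct model_tty *tp, bool clocal)
{
	tp->clocal = clocal;
	if (!clocal && !(tp->state & M_CARR_ON)) {
		tp->state &= ~M_ISOPEN;
		tp->state |= M_WOPEN;
	}
}

/* Item 4: on final close, DTR is dropped if HUPCL is set or WOPEN is set. */
static bool
model_final_close_drops_dtr(const struct model_tty *tp)
{
	return tp->hupcl || (tp->state & M_WOPEN) != 0;
}
```

Note that the model has no path from the item 3 state back to ISOPEN other than calling the open functions again.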
I see several problems here:
* Using the hp300 port as a reference, the transition mentioned in #3
didn't actually occur, because dcaparam() copied cflag into the tty
state, causing the relevant test (now lines 836-842 of tty.c) to
always fail. This has the effect that ISOPEN is always set in
ttread() and ttwrite(), which in turn means that turning off CLOCAL
is like actually hanging up the other end of the line: subsequent
reads will return EOF, and writes will fail (unless of course you
turn on CLOCAL again first, or carrier is asserted). On a more
microscopic level, it means that the ISOPEN checks in these routines
are always true.
* If the transition had occurred, there wasn't actually a way to get
*back* to ISOPEN state, except to open the device again. (I suspect
this may be the origin of the `close-open to force mode change'
garbage in Kermit.) Furthermore, an unlucky process stuck waiting
for carrier in ttread() or ttwrite() would have had really bizarre
semantics. (It would have waited for another process to open the
device. If the open was non-blocking, the process in ttread() or
ttwrite() would most likely have returned EOF at this point. If the
open was blocking, it would most likely have gone back to sleep
right away waiting for data to be read or written.)
* Worse yet, consider the most common case where you turn off CLOCAL:
after successfully dialing a modem. There's a race condition here.
If you lose carrier before CLOCAL is turned off, then the transition
to WOPEN would have caused the process to simply block waiting for
carrier to be asserted again. The current behaviour will cause the
process to see EOF and exit, which is much more desirable.
* The check in #4 is part of the fairly bogus DTR handling in 4.4 --
namely, that it always turns off DTR when it loses carrier. Without
going into the gory detail of why I think the `always turn off DTR
if we lose carrier' algorithm is severely broken, I'll point out
that no non-BSD system I tested exhibited this behaviour. Always
leaving DTR asserted unless it's explicitly changed with TIOC?DTR or
TIOCM??? is much simpler and less `magic'.
As you can see, this is all much more complex than one might have
imagined.
Some of this mess (the DTR handling) has already been cleaned up. To
clean up the rest of it, I plan to do the following:
* Since the transition in #3 is unused, and its (apparently) intended
behaviour is highly dubious anyway, simply remove it.
* Rather than the house of cards alluded to in #5, replace TS_WOPEN
with a count of processes (tp->t_wopen) waiting for carrier in the
open routine. This has the added advantage that (as mentioned in my
earlier mail) this value can be used if the open fails to see if
another process currently has the device in use, even if it hasn't
gotten to ISOPEN state yet, and thus determine whether the device
should be shut off.
* When deciding whether to reinitialize the device in the open
routine, check both TS_ISOPEN and t_wopen, so we don't stomp on
another process waiting for carrier.
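To make the last two points concrete, here is a minimal userland sketch of the counting scheme. Only TS_ISOPEN and t_wopen come from the text above; the struct, the helper names, and the init_count and powered fields are hypothetical stand-ins for whatever a real driver does to initialize and shut off its hardware, and the TS_ISOPEN value is illustrative.

```c
#include <assert.h>
#include <stdbool.h>

#define TS_ISOPEN 0x1u  /* illustrative value for this sketch */

struct sketch_tty {
	unsigned t_state;  /* TS_* flags */
	int t_wopen;       /* processes waiting for carrier in open */
	int init_count;    /* hypothetical: times the device was initialized */
	bool powered;      /* hypothetical: device is currently turned on */
};

/* The device is in use if it is open or if anyone is waiting in open. */
static bool
sketch_in_use(const struct sketch_tty *tp)
{
	return (tp->t_state & TS_ISOPEN) != 0 || tp->t_wopen > 0;
}

/* Entering open: initialize only if no other process has the device. */
static void
sketch_open_begin(struct sketch_tty *tp)
{
	if (!sketch_in_use(tp)) {
		tp->init_count++;
		tp->powered = true;
	}
	tp->t_wopen++;
}

/* Leaving open: on failure, shut off only if nobody else is using it. */
static void
sketch_open_end(struct sketch_tty *tp, bool success)
{
	assert(tp->t_wopen > 0);
	tp->t_wopen--;
	if (success)
		tp->t_state |= TS_ISOPEN;
	else if (!sketch_in_use(tp))
		tp->powered = false;
}
```

With two processes in sketch_open_begin(), the second does not reinitialize the device, and a failed open by one of them leaves the device powered for the other. A real driver would also need locking or spl protection around the counter, which this sketch omits.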
I think that covers all the bases.