Subject: tty layer bogons from the depths of the abyss
To: None <tech-kern@NetBSD.ORG>
From: Charles M. Hannum <mycroft@mit.edu>
List: tech-kern
Date: 03/19/1998 05:00:18
[Warning: The following is a ball of hair.  It will damage your brain.
Don't read it.]


So I decided to research the history and usage of WOPEN a bit more
before redefining it.  It appears that in 4.4 it was used as follows:

1) On first open of the device, and on subsequent opens that block
   waiting for carrier, WOPEN is set.

2) When the device is successfully opened, WOPEN is cleared, and
   ISOPEN is set.

3) When CLOCAL is turned off and carrier is not present (CARR_ON is
   off), ISOPEN is cleared and WOPEN is set.  [Note: This is a lie.
   See why below.]  This has the effect of causing subsequent reads
   and writes to block awaiting carrier.

4) On final close, if WOPEN is set (translation: since nothing can be
   blocking in the open routine if this is a final close, this means
   that the device must have transitioned to WOPEN as in #3 above),
   DTR is always turned off, even if HUPCL is not set.

5) Things were carefully constructed such that any time WOPEN was
   cleared, all processes that might have been waiting for carrier in
   the open routine awoke, set WOPEN again, and went back to sleep.

I see several problems here:

* Using the hp300 port as a reference, the transition mentioned in #3
  didn't actually occur, because dcaparam() copied cflag into the tty
  state, causing the relevant test (now lines 836-842 of tty.c) to
  always fail.  This has the effect that ISOPEN is always set in
  ttread() and ttwrite(), which in turn means that turning off CLOCAL
  is like actually hanging up the other end of the line: subsequent
  reads will return EOF, and writes will fail (unless of course you
  turn on CLOCAL again first, or carrier is asserted).  On a more
  microscopic level, it means that the ISOPEN checks in these routines
  are always true.

* If the transition had occurred, there wasn't actually a way to get
  *back* to ISOPEN state, except to open the device again.  (I suspect
  this may be the origin of the `close-open to force mode change'
  garbage in Kermit.)  Furthermore, an unlucky process stuck waiting
  for carrier in ttread() or ttwrite() would have had really bizarre
  semantics.  (It would have waited for another process to open the
  device.  If the open was non-blocking, the process in ttread() or
  ttwrite() would most likely have returned EOF at this point.  If the
  open was blocking, it would most likely have gone back to sleep
  right away waiting for data to be read or written.)

* Worse yet, consider the most common case where you turn off CLOCAL:
  after successfully dialing a modem.  There's a race condition here.
  If you lose carrier before CLOCAL is turned off, then the transition
  to WOPEN would have caused the process to simply block waiting for
  carrier to be asserted again.  The current behaviour will cause the
  process to see EOF and exit, which is much more desirable.

* The check in #4 is part of the fairly bogus DTR handling in 4.4 --
  namely, that it always turns off DTR when it loses carrier.  Without
  going into the gory detail of why I think the `always turn off DTR
  if we lose carrier' algorithm is severely broken, I'll point out
  that no non-BSD system I tested exhibited this behaviour.  Always
  leaving DTR asserted unless it's explicitly changed with TIOC?DTR or
  TIOCM??? is much simpler and less `magic'.
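
To make the first problem concrete, here is a toy model in C of the
ordering issue. It is a sketch under assumptions, not the code from
tty.c or the hp300 driver: the struct, the flag values, and the names
toy_tty, clocal_dropped() and toy_setattr() are all made up for
illustration. It only shows why a test that compares the old and new
cflag can never fire once the driver has already copied the new cflag
into the tty.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values; the real ones live in the kernel headers. */
#define CLOCAL     0x1u	/* cflag bit: ignore modem status lines */
#define TS_CARR_ON 0x1u	/* state bit: carrier is present */
#define TS_ISOPEN  0x2u	/* state bit: device is open */
#define TS_WOPEN   0x4u	/* state bit: waiting for carrier */

/* Toy stand-in for the two tty fields that matter here. */
struct toy_tty {
	unsigned t_cflag;
	unsigned t_state;
};

/* The condition from #3: CLOCAL goes from on to off with no carrier. */
static bool
clocal_dropped(const struct toy_tty *tp, unsigned new_cflag)
{
	return (tp->t_cflag & CLOCAL) != 0 &&
	    (new_cflag & CLOCAL) == 0 &&
	    (tp->t_state & TS_CARR_ON) == 0;
}

/*
 * Model of a "set attributes" request.  If driver_copies_first is
 * true, the new cflag has already been stored in the tty before the
 * comparison runs, which is the situation described for dcaparam().
 */
static void
toy_setattr(struct toy_tty *tp, unsigned new_cflag, bool driver_copies_first)
{
	if (driver_copies_first)
		tp->t_cflag = new_cflag;	/* old value is lost here */

	if (clocal_dropped(tp, new_cflag)) {
		tp->t_state &= ~TS_ISOPEN;
		tp->t_state |= TS_WOPEN;
	}
	tp->t_cflag = new_cflag;
}
```

Starting from t_cflag = CLOCAL and t_state = TS_ISOPEN, calling
toy_setattr(&t, 0, true) leaves t_state untouched, because the old and
new cflag are already equal when they are compared.  Passing false
instead moves the toy tty to TS_WOPEN.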

As you can see, this is all much more complex than one might have
imagined.


Some of this mess (the DTR handling) has already been cleaned up.  To
clean up the rest of it, I plan to do the following:

* Since the transition in #3 is unused, and its (apparently) intended
  behaviour is highly dubious anyway, simply remove it.

* Rather than the house of cards alluded to in #5, replace TS_WOPEN
  with a count of processes (tp->t_wopen) waiting for carrier in the
  open routine.  This has the added advantage that (as mentioned in my
  earlier mail) when an open fails, this value can be used to see
  whether another process currently has the device in use, even if it
  hasn't gotten to ISOPEN state yet, and thus to determine whether the
  device should be shut off.

* When deciding whether to reinitialize the device in the open
  routine, check both TS_ISOPEN and t_wopen, so we don't stomp on
  another process waiting for carrier.
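
The counter idea above can be sketched roughly as follows.  This is a
toy model, not a patch: only the names t_wopen and TS_ISOPEN come from
the plan, while the struct, the flag value, and every toy_* helper are
hypothetical and exist just to show the two checks.

```c
#include <assert.h>
#include <stdbool.h>

#define TS_ISOPEN 0x2u	/* hypothetical value for the "is open" bit */

/* Toy stand-in for the tty fields involved in the plan. */
struct toy_tty {
	unsigned t_state;
	int t_wopen;	/* processes in open, waiting for carrier */
};

/* Called just before a process sleeps in open awaiting carrier. */
static void
toy_wopen_enter(struct toy_tty *tp)
{
	tp->t_wopen++;
}

/* Called when that process wakes up, whether open succeeds or not. */
static void
toy_wopen_leave(struct toy_tty *tp)
{
	assert(tp->t_wopen > 0);
	tp->t_wopen--;
}

/* Someone has the device open, or is still waiting in open. */
static bool
toy_in_use(const struct toy_tty *tp)
{
	return (tp->t_state & TS_ISOPEN) != 0 || tp->t_wopen > 0;
}

/* Open path: reinitialize only if nobody else would be stomped on. */
static bool
toy_open_should_init(const struct toy_tty *tp)
{
	return !toy_in_use(tp);
}

/*
 * Failed-open path, evaluated after the failing process has called
 * toy_wopen_leave(): shut the device off only if nobody else has it.
 */
static bool
toy_failed_open_should_shutdown(const struct toy_tty *tp)
{
	return !toy_in_use(tp);
}
```

Unlike a single flag bit, the count does not have to be set again by
every sleeper each time one of them wakes up, which is the fragile
part of #5.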

I think that covers all the bases.