Subject: telnet spins to dead tty
To: None <tech-userlevel@netbsd.org>
From: john heasley <heas@shrubbery.net>
List: tech-userlevel
Date: 10/25/2002 07:58:13

i have a set of scripts that use expect to login to devices via telnet
(or ssh).  if the scripts are not configured properly, the expect
script can hang after a successful login (username and password),
waiting for output that never arrives.

a timeout ensues, and the expect script closes the tty and waits for
the child process to exit.  some time after this, the device closes
the connection due to inactivity.

here's the telnet issue: telnet takes a SIGPIPE and spins out of control
trying to flush the tty facing expect, in terminal.c:ttyflush().

#0  ttyflush (drop=0) at /home/src/usr.bin/telnet/terminal.c:159
#1  0x804ed93 in TerminalNewMode (f=-1)
    at /home/src/usr.bin/telnet/sys_bsd.c:445
#2  0x8053cc8 in setcommandmode () at /home/src/usr.bin/telnet/terminal.c:249
#3  0x804f143 in deadpeer (sig=13) at /home/src/usr.bin/telnet/sys_bsd.c:869
#4  0x480fc1f0 in __sigtramp_sigcontext_1 () from /usr/lib/libc.so.12
#5  0x804e40a in netflush () at /home/src/usr.bin/telnet/network.c:145

so, the telnet process:

COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
telnet  28617 heas    0u  VBAD                         (revoked)
telnet  28617 heas    1u  VBAD                         (revoked)
telnet  28617 heas    2u  VBAD                         (revoked)
telnet  28617 heas    3u  IPv4             0t0     TCP no PCB, CANTSENDMORE, CANTRCVMORE
telnet  28617 heas    4w  VREG    0,0      331 1022641 / (/dev/wd0a)

the write(fd=1) within TerminalWrite() returns -1 (errno = 5, EIO).  but
ttyflush() only communicates the following in its return value:

 *              Return value:
 *                      -1: No useful work done, data waiting to go out.
 *                       0: No data was waiting, so nothing was done.
 *                       1: All waiting data was written out.
 *                       n: All data - n was written out.

since ttyflush() isn't designed to return a "permanent failure" result,
the caller just calls it again, forever.
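
to make the loop concrete, here is a minimal sketch of the failure mode.
it is not the telnet source: dead_tty_write(), flush_once() and
spin_until() are hypothetical stand-ins, and the retry condition (retry
while the result is negative or data remains) is an assumption about how
the caller behaves.  the retry count is capped only so the sketch
terminates.

```c
#include <errno.h>
#include <stddef.h>

/* hypothetical stand-in for TerminalWrite() on a revoked tty:
 * every write fails with EIO. */
static int dead_tty_write(const char *buf, size_t n)
{
    (void)buf;
    (void)n;
    errno = EIO;
    return -1;
}

/* simplified flush with the documented return convention:
 * -1 no useful work done, 0 nothing waiting, 1 all written,
 * >1 some data still waiting (simplified). */
static int flush_once(size_t *pending)
{
    int n;

    if (*pending == 0)
        return 0;
    n = dead_tty_write("", *pending);
    if (n < 0)
        return -1;              /* EIO is folded into "try again" */
    *pending -= (size_t)n;
    return *pending == 0 ? 1 : 1 + (int)*pending;
}

/* assumed caller: retries until the data is gone.  the limit exists
 * only so this sketch stops; without it the loop never ends. */
static long spin_until(long limit)
{
    size_t pending = 16;
    long calls = 0;
    int r;

    do {
        r = flush_once(&pending);
        calls++;
    } while ((r < 0 || r > 1) && calls < limit);
    return calls;
}
```

with a dead tty, spin_until(limit) always uses up the whole limit, because
-1 cannot distinguish "try again later" from "this will never work".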

i suppose the question is how best to fix this.  teach ttyflush() about
permanent errors, and teach its callers about the new return value?
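
one possible shape for that, as a rough sketch only: FLUSH_PERMFAIL,
flush_sketch() and drain_sketch() are hypothetical names, the writer is
passed in as a function pointer so the idea can be exercised without a
tty, and treating everything other than EAGAIN/EWOULDBLOCK/EINTR as
permanent is an assumption, not something taken from the telnet source.

```c
#include <errno.h>
#include <stddef.h>

#define FLUSH_PERMFAIL (-2)     /* hypothetical new return value */

typedef int (*write_fn)(const char *buf, size_t n);

/* assumption: only these errors are worth retrying */
static int is_transient(int err)
{
    return err == EAGAIN || err == EWOULDBLOCK || err == EINTR;
}

/* flush that reports a permanent failure separately from -1 */
static int flush_sketch(write_fn w, const char *buf, size_t *pending)
{
    int n;

    if (*pending == 0)
        return 0;
    n = w(buf, *pending);
    if (n < 0)
        return is_transient(errno) ? -1 : FLUSH_PERMFAIL;
    *pending -= (size_t)n;
    return 1 + (int)*pending;   /* 1 means everything was written */
}

/* caller that understands the new value: returns -1 if the tty is
 * gone for good, otherwise the number of retries it needed. */
static int drain_sketch(write_fn w, const char *buf, size_t pending,
                        int max_tries)
{
    int r, tries = 0;

    do {
        r = flush_sketch(w, buf, &pending);
        if (r == FLUSH_PERMFAIL)
            return -1;          /* give up instead of spinning */
    } while ((r < 0 || r > 1) && ++tries < max_tries);
    return tries;
}

/* demo writers: one behaves like a revoked tty, one always succeeds */
static int write_eio(const char *buf, size_t n)
{
    (void)buf;
    (void)n;
    errno = EIO;
    return -1;
}

static int write_ok(const char *buf, size_t n)
{
    (void)buf;
    return (int)n;
}
```

with write_eio the caller bails out on the first attempt; with write_ok
it finishes with no retries.  what the real callers should do once they
see the permanent failure (exit, most likely) is a separate decision.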