Subject: bin/28171: telnet can spin in infinite loop doing syscalls
To: None <gnats-admin@netbsd.org, netbsd-bugs@netbsd.org>
From: None <he@uninett.no>
List: netbsd-bugs
Date: 11/10/2004 18:35:00
>Number:         28171
>Category:       bin
>Synopsis:       telnet can spin in infinite loop doing syscalls
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Nov 10 18:35:00 +0000 2004
>Originator:     Havard Eidnes
>Release:        NetBSD 1.6.2_STABLE
>Organization:
	UNINETT AS
>Environment:
System: NetBSD smistad.uninett.no 1.6.2_STABLE NetBSD 1.6.2_STABLE (GENERIC) #0: Wed Sep 29 12:10:04 CEST 2004 he@smistad.uninett.no:/usr/src/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
	Some of our users run telnet from scripts, possibly from
	within expect.  Once in a while one of these telnet processes
	goes "wild" and consumes most of a CPU, as shown here:

load averages:  2.38,  2.38,  2.44                                     19:12:24
145 processes: 1 runnable, 131 sleeping, 7 stopped, 5 zombie, 1 on processor
CPU states:  0.5% user, 39.3% nice, 59.2% system,  0.5% interrupt,  0.5% idle
Memory: 804M Act, 1080K Inact, 2292K Wired, 25M Exec, 550M File, 106M Free
Swap: 1500M Total, 1500M Free

  PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
16812 trond     70    4   876K  620K RUN      682:52 83.98% 83.98% telnet

	While it is in this state, none of stdin, stdout or stderr is
	usable any longer (fstat lists descriptors 0, 1 and 2 as
	"none"), as shown here:

# fstat -p 16812
USER     CMD          PID   FD MOUNT       INUM MODE         SZ|DV R/W
trond    telnet     16812   wd /home     587776 drwxr-xr-x    1024 r 
trond    telnet     16812    0 -         -        none    -
trond    telnet     16812    1 -         -        none    -
trond    telnet     16812    2 -         -        none    -
trond    telnet     16812    3* internet stream tcp
trond    telnet     16812    4* pipe 0xc16dd488 -> 0xc16dd388 w

	A ktrace / kdump reveals that it's in an infinite loop doing
	ioctl and write calls in quick succession ("kdump -R" output):

 16812 telnet   0.000795 CALL  ioctl(0,TIOCSETAW,0xbfbfda08)
 16812 telnet   0.000032 RET   ioctl -1 errno 25 Inappropriate ioctl for device
 16812 telnet   0.000011 CALL  write(0x1,0x8106060,0x96)
 16812 telnet   0.000010 RET   write -1 errno 5 Input/output error
 16812 telnet   0.000009 CALL  ioctl(0,TIOCSETAW,0xbfbfda08)
 16812 telnet   0.000009 RET   ioctl -1 errno 25 Inappropriate ioctl for device
 16812 telnet   0.000009 CALL  write(0x1,0x8106060,0x96)
 16812 telnet   0.000009 RET   write -1 errno 5 Input/output error
 16812 telnet   0.000010 CALL  ioctl(0,TIOCSETAW,0xbfbfda08)
 16812 telnet   0.000008 RET   ioctl -1 errno 25 Inappropriate ioctl for device
 16812 telnet   0.000009 CALL  write(0x1,0x8106060,0x96)
 16812 telnet   0.000010 RET   write -1 errno 5 Input/output error
 16812 telnet   0.000008 CALL  ioctl(0,TIOCSETAW,0xbfbfda08)
 16812 telnet   0.000009 RET   ioctl -1 errno 25 Inappropriate ioctl for device
 16812 telnet   0.000009 CALL  write(0x1,0x8106060,0x96)
 16812 telnet   0.000009 RET   write -1 errno 5 Input/output error
 16812 telnet   0.000009 CALL  ioctl(0,TIOCSETAW,0xbfbfda08)
 16812 telnet   0.000033 RET   ioctl -1 errno 25 Inappropriate ioctl for device
 16812 telnet   0.000010 CALL  write(0x1,0x8106060,0x96)
 16812 telnet   0.000009 RET   write -1 errno 5 Input/output error
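
	The failing ioctl in the trace can be illustrated outside of
	telnet.  The following is only a sketch: drain_errno() is a
	hypothetical helper, not code from telnet, and it assumes that
	a descriptor which is not a terminal (a pipe, say) is a fair
	stand-in for the "none" descriptor 0 shown by fstat above.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <termios.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of telnet): issue the same
 * TCSADRAIN request seen in the trace and return the errno it
 * fails with, or 0 if the call succeeds.
 */
int
drain_errno(int fd)
{
	struct termios tc;

	assert(fd >= 0);

	/*
	 * Start from the current settings when fd is a terminal, so
	 * that a successful call changes nothing.  Otherwise fall back
	 * to a zeroed structure; the call is expected to fail anyway.
	 */
	if (tcgetattr(fd, &tc) == -1)
		memset(&tc, 0, sizeof(tc));

	if (tcsetattr(fd, TCSADRAIN, &tc) == -1)
		return errno;
	return 0;
}
```

	Called on either end of a pipe, drain_errno() should return
	ENOTTY, which is the "errno 25 Inappropriate ioctl for device"
	seen in the trace.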

	Attaching to the process using gdb reveals the loop it is in
	(yes, I had it linked statically with debug already...):

(gdb) where
#0  0x80508de in ttyflush (drop=0) at /usr/src/usr.bin/telnet/terminal.c:166
#1  0x804c820 in TerminalNewMode (f=-1)
    at /usr/src/usr.bin/telnet/sys_bsd.c:449
#2  0x8050a08 in setcommandmode () at /usr/src/usr.bin/telnet/terminal.c:249
#3  0x804cb33 in deadpeer (sig=13) at /usr/src/usr.bin/telnet/sys_bsd.c:873
#4  0xbfbfdfdc in ?? ()
#5  0x804c09d in netflush () at /usr/src/usr.bin/telnet/network.c:155
#6  0x804d034 in process_rings (netin=1, netout=1, netex=1, ttyin=0, 
    ttyout=150, poll=1) at /usr/src/usr.bin/telnet/sys_bsd.c:1221
#7  0x804ff89 in Scheduler (block=0) at /usr/src/usr.bin/telnet/telnet.c:2268
#8  0x8050122 in telnet (user=0xbfbfdd75 "trond")
    at /usr/src/usr.bin/telnet/telnet.c:2356
#9  0x804b0be in tn (argc=0, argv=0xbfbfdc6c)
    at /usr/src/usr.bin/telnet/commands.c:2581
#10 0x804becf in main (argc=2, argv=0xbfbfdcf8)
    at /usr/src/usr.bin/telnet/main.c:369
#11 0x804825c in ___start ()
(gdb) 

	
>How-To-Repeat:
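	Not exactly certain how to provoke this.  Judging from the
	backtrace and the fstat output, it seems to require that
	telnet receives SIGPIPE (sig=13) in netflush() after
	descriptors 0, 1 and 2 have already become unusable.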

>Fix:
	
	This fix checks the return value from tcsetattr() and exits
	if the call fails, which I think would break this loop
	(TCSADRAIN is defined as TIOCSETAW earlier in sys_bsd.c):

Index: sys_bsd.c
===================================================================
RCS file: /cvsroot/src/usr.bin/telnet/sys_bsd.c,v
retrieving revision 1.18
diff -u -r1.18 sys_bsd.c
--- sys_bsd.c	11 Feb 2002 11:00:07 -0000	1.18
+++ sys_bsd.c	10 Nov 2004 18:23:19 -0000
@@ -444,7 +444,10 @@
 	     * Wait for data to drain, then flush again.
 	     */
 #ifdef	USE_TERMIO
-	    tcsetattr(tin, TCSADRAIN, &tmp_tc);
+	    if (tcsetattr(tin, TCSADRAIN, &tmp_tc) == -1) {
+		perror("tcsetattr");
+		exit(1);
+	    }
 #endif	/* USE_TERMIO */
 	    old = ttyflush(SYNCHing|flushout);
 	} while (old < 0 || old > 1);
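
	The effect of the check can be sketched in isolation.  This is
	not the telnet code: drain_loop() and fake_ttyflush() are
	hypothetical names, fake_ttyflush() simply assumes that output
	can never be written (as with the EIO from write() above), the
	max_passes cap exists only so that the unchecked variant
	terminates, and the loop is left with "break" where the patch
	calls exit(1).

```c
#include <assert.h>
#include <string.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical stand-in for ttyflush(): the flush always fails. */
static int
fake_ttyflush(void)
{
	return -1;
}

/*
 * Run a loop shaped like the one in the patch and return the number
 * of passes made.  If check_result is non-zero, a failing tcsetattr()
 * ends the loop (the patch calls exit(1) at that point).  max_passes
 * is an artificial cap that the real loop does not have.
 */
int
drain_loop(int fd, int check_result, int max_passes)
{
	struct termios tc;
	int old = 0;
	int passes = 0;

	assert(max_passes > 0);

	/* Use the current settings if fd is a terminal, else zeroes. */
	if (tcgetattr(fd, &tc) == -1)
		memset(&tc, 0, sizeof(tc));

	do {
		passes++;
		if (tcsetattr(fd, TCSADRAIN, &tc) == -1 && check_result)
			break;
		old = fake_ttyflush();
	} while ((old < 0 || old > 1) && passes < max_passes);

	return passes;
}
```

	With fd referring to a pipe, drain_loop(fd, 1, 1000) stops
	after the first pass, while drain_loop(fd, 0, 1000) only stops
	at the artificial cap; without the cap it would spin just as
	the trace shows.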