Subject: kern/17421: local TCP connection hang
To: None <gnats-bugs@gnats.netbsd.org>
From: Manuel Bouyer <bouyer@antioche.lip6.fr>
List: netbsd-bugs
Date: 06/27/2002 23:03:16
>Number:         17421
>Category:       kern
>Synopsis:       local TCP connection hang
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Jun 27 14:04:00 PDT 2002
>Closed-Date:
>Last-Modified:
>Originator:     Manuel Bouyer
>Release:        NetBSD 1.6_BETA1 sources from jun, 3
>Organization:
	LIP6
>Environment:
	
System: NetBSD swing 1.6_BETA1 NetBSD 1.6_BETA1 (SWING) #1: Mon Jun 3 18:10:57 CEST 2002 bouyer@swing:/home/src/src/sys/arch/i386/compile/SWING i386
Architecture: i386
Machine: i386
>Description:
	This machine is a amanda backup server (amanda-dev-* packages).
	When running amrecover it will contact the server though a TCP
	connection, after some exanges the server will send the dump file.
	amrecover will dup2() the socket to stdin fork "restore xbf 2 -"
	to extract the dump archive.
	If amrecover is run from a remote machine all is fine. But
	when it's run from the local machine, it stalls:
tcp        0  16384  swing.amidxtap         swing.800              ESTABLISHED
tcp        0      0  swing.800              swing.amidxtap         ESTABLISHED

	port 800 is the client side. Note that the server end has data waiting
	in the send queue, when the receive queue on the client side is empty.
	processes using these sockets are:
amanda   gzip       21128    1* internet stream tcp c0aad0f4 132.227.63.66:10083 <-> 132.227.63.66:800
root     restore    20894    0* internet stream tcp c0aad230 132.227.63.66:800 <-> 132.227.63.66:10083

and that's all. both gzip and restore are blocked on "netio".
A tcpdump on lo0 shows:
22:58:31.714915 swing.amidxtape > swing.800: P 840759878:840776262(16384) ack 819044265 win 16384 <nop,nop,timestamp 2276 0>
22:58:31.914888 swing.800 > swing.amidxtape: . ack 16384 win 65535 <nop,nop,timestamp 2277 2276>
22:58:31.914977 swing.amidxtape > swing.800: P 16384:32768(16384) ack 1 win 16384 <nop,nop,timestamp 2277 0>
22:58:32.114906 swing.800 > swing.amidxtape: . ack 32768 win 65535 <nop,nop,timestamp 2277 2277>
22:58:37.115114 swing.amidxtape > swing.800: P 32768:49152(16384) ack 1 win 16384 <nop,nop,timestamp 2287 0>
22:58:37.315084 swing.800 > swing.amidxtape: . ack 49152 win 65535 <nop,nop,timestamp 2288 2287>
22:58:37.315179 swing.amidxtape > swing.800: P 49152:65536(16384) ack 1 win 16384 <nop,nop,timestamp 2288 0>
22:58:37.515091 swing.800 > swing.amidxtape: . ack 65536 win 65535 <nop,nop,timestamp 2288 2288>
22:58:42.515306 swing.amidxtape > swing.800: P 65536:81920(16384) ack 1 win 16384 <nop,nop,timestamp 2298 0>
22:58:42.715277 swing.800 > swing.amidxtape: . ack 81920 win 65535 <nop,nop,timestamp 2298 2298>
22:58:42.715365 swing.amidxtape > swing.800: P 81920:98304(16384) ack 1 win 16384 <nop,nop,timestamp 2298 0>
22:58:42.915295 swing.800 > swing.amidxtape: . ack 98304 win 65535 <nop,nop,timestamp 2299 2298>
22:58:47.915494 swing.amidxtape > swing.800: P 98304:114688(16384) ack 1 win 16384 <nop,nop,timestamp 2309 0>
22:58:48.115465 swing.800 > swing.amidxtape: . ack 114688 win 65535 <nop,nop,timestamp 2309 2309>
22:58:48.115551 swing.amidxtape > swing.800: P 114688:131072(16384) ack 1 win 16384 <nop,nop,timestamp 2309 0>
22:58:48.315478 swing.800 > swing.amidxtape: . ack 131072 win 65535 <nop,nop,timestamp 2310 2309>
22:58:53.315692 swing.amidxtape > swing.800: P 131072:147456(16384) ack 1 win 16384 <nop,nop,timestamp 2320 0>
22:58:53.515658 swing.800 > swing.amidxtape: . ack 147456 win 65535 <nop,nop,timestamp 2320 2320>
22:58:53.515744 swing.amidxtape > swing.800: P 147456:163840(16384) ack 1 win 16384 <nop,nop,timestamp 2320 0>
22:58:53.715680 swing.800 > swing.amidxtape: . ack 163840 win 65535 <nop,nop,timestamp 2320 2320>
22:58:58.715888 swing.amidxtape > swing.800: P 163840:180224(16384) ack 1 win 16384 <nop,nop,timestamp 2330 0>
22:58:58.915860 swing.800 > swing.amidxtape: . ack 180224 win 65535 <nop,nop,timestamp 2331 2330>
22:58:58.915956 swing.amidxtape > swing.800: P 180224:196608(16384) ack 1 win 16384 <nop,nop,timestamp 2331 0>
22:58:59.115865 swing.800 > swing.amidxtape: . ack 196608 win 65535 <nop,nop,timestamp 2331 2331>
22:59:04.116080 swing.amidxtape > swing.800: P 196608:212992(16384) ack 1 win 16384 <nop,nop,timestamp 2341 0>
22:59:04.316047 swing.800 > swing.amidxtape: . ack 212992 win 65535 <nop,nop,timestamp 2342 2341>
22:59:04.316141 swing.amidxtape > swing.800: P 212992:229376(16384) ack 1 win 16384 <nop,nop,timestamp 2342 0>
22:59:04.516058 swing.800 > swing.amidxtape: . ack 229376 win 65535 <nop,nop,timestamp 2342 2342>
22:59:09.516272 swing.amidxtape > swing.800: P 229376:245760(16384) ack 1 win 16384 <nop,nop,timestamp 2352 0>
22:59:09.716246 swing.800 > swing.amidxtape: . ack 245760 win 65535 <nop,nop,timestamp 2352 2352>
22:59:09.716348 swing.amidxtape > swing.800: P 245760:262144(16384) ack 1 win 16384 <nop,nop,timestamp 2352 0>
22:59:09.916250 swing.800 > swing.amidxtape: . ack 262144 win 65535 <nop,nop,timestamp 2353 2352>
22:59:14.916466 swing.amidxtape > swing.800: P 262144:278528(16384) ack 1 win 16384 <nop,nop,timestamp 2363 0>
22:59:15.116431 swing.800 > swing.amidxtape: . ack 278528 win 65535 <nop,nop,timestamp 2363 2363>
22:59:15.116520 swing.amidxtape > swing.800: P 278528:294912(16384) ack 1 win 16384 <nop,nop,timestamp 2363 0>
22:59:15.316442 swing.800 > swing.amidxtape: . ack 294912 win 65535 <nop,nop,timestamp 2364 2363>
22:59:20.316659 swing.amidxtape > swing.800: P 294912:311296(16384) ack 1 win 16384 <nop,nop,timestamp 2374 0>
22:59:20.516628 swing.800 > swing.amidxtape: . ack 311296 win 65535 <nop,nop,timestamp 2374 2374>
22:59:20.516729 swing.amidxtape > swing.800: P 311296:327680(16384) ack 1 win 16384 <nop,nop,timestamp 2374 0>
22:59:20.716642 swing.800 > swing.amidxtape: . ack 327680 win 65535 <nop,nop,timestamp 2374 2374>


>How-To-Repeat:
	I tried to reproduce this problem with a simple test case, but
	I coult'nt. Even sending a dump file though gzip and TCP to restore
	works as expected, so it may be timing-related.
	however I can reliably reproduce it with amanda.
>Fix:
	Unknow.
>Release-Note:
>Audit-Trail:
>Unformatted: