tech-net archive
Re: SACK oddity
> There is an open SACK issue PR kern/55800 (2020) that renders SACK
> unusable in the field. It may be related to this observation, though
> I did not check.
> Looks like the SACK code needs a check.
I now suspect it is unrelated. When I disable SACK
(net.inet.tcp.sack.enable=0) on the connecting host, the ssh connection
wedges at, from what I can tell, the same point. tcpdump shows no
SACK-permitted option in the SYN packets and no SACK options afterwards,
but it still hangs:
09:12:12.982380 IP me.54726 > them.22: S 1678357181:1678357181(0) win 32768 <mss 1460,nop,wscale 0,nop,nop,timestamp 0 0>
09:12:12.983247 IP them.22 > me.54726: S 1619729590:1619729590(0) ack 1678357182 win 32768 <mss 1460,nop,wscale 3,nop,nop,timestamp 1 0>
09:12:12.983318 IP me.54726 > them.22: . ack 1 win 33580 <nop,nop,timestamp 0 1>
09:12:12.985618 IP them.22 > me.54726: P 1:75(74) ack 1 win 4197 <nop,nop,timestamp 1 0>
09:12:12.986473 IP them.22 > me.54726: . 75:1523(1448) ack 1 win 4197 <nop,nop,timestamp 1 0>
09:12:12.986536 IP me.54726 > them.22: . ack 1523 win 32058 <nop,nop,timestamp 0 1>
09:12:12.986590 IP them.22 > me.54726: P 1523:2971(1448) ack 1 win 4197 <nop,nop,timestamp 1 0>
09:12:13.180955 IP me.54726 > them.22: . ack 2971 win 30610 <nop,nop,timestamp 0 1>
09:12:13.181770 IP them.22 > me.54726: P 2971:2995(24) ack 1 win 4197 <nop,nop,timestamp 1 0>
09:12:13.333237 IP me.54726 > them.22: . ack 2995 win 33580 <nop,nop,timestamp 1 1>
09:12:13.896283 IP me.54726 > them.22: . 1:1149(1148) ack 2995 win 33580 <nop,nop,timestamp 2 1>
09:12:13.896344 IP me.54726 > them.22: . 1149:2297(1148) ack 2995 win 33580 <nop,nop,timestamp 2 1>
09:12:13.897171 IP them.22 > me.54726: . ack 2297 win 3910 <nop,nop,timestamp 2 2>
09:12:13.897253 IP me.54726 > them.22: P 2297:3329(1032) ack 2995 win 33580 <nop,nop,timestamp 2 1>
09:12:13.897954 IP them.22 > me.54726: . ack 3329 win 4197 <nop,nop,timestamp 2 2>
09:12:15.437326 IP them.22 > me.54726: P 2995:3723(728) ack 3329 win 4197 <nop,nop,timestamp 5 2>
09:12:15.631377 IP me.54726 > them.22: . ack 3723 win 33580 <nop,nop,timestamp 5 5>
09:12:28.770436 IP me.54726 > them.22: P 3329:3405(76) ack 3723 win 33580 <nop,nop,timestamp 31 5>
09:12:30.263923 IP me.54726 > them.22: P 3329:3405(76) ack 3723 win 33580 <nop,nop,timestamp 34 5>
09:12:30.264757 IP them.22 > me.54726: . ack 3405 win 4197 <nop,nop,timestamp 35 31>
(nothing further for over two minutes)
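
For anyone wanting to pull the same two observations (the retransmitted
segment and the long silences) out of a longer capture, here is a
minimal Python sketch. It assumes the old-style tcpdump text format
quoted above; the names parse, retransmissions and longest_gap are
illustrative only and are not part of tcpdump or any library.

```python
import re
from datetime import datetime

# Matches lines in the tcpdump text format quoted above:
# timestamp, source, destination, flags, optional start:end(len) range.
LINE = re.compile(
    r"^(?P<ts>\d\d:\d\d:\d\d\.\d+) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"(?P<flags>\S+) (?:(?P<start>\d+):(?P<end>\d+)\((?P<len>\d+)\))?"
)

def parse(lines):
    """Yield (time, source, (start, end) or None, payload length)."""
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue  # skip anything that is not a packet line
        ts = datetime.strptime(m["ts"], "%H:%M:%S.%f")
        seg = (int(m["start"]), int(m["end"])) if m["start"] else None
        yield ts, m["src"], seg, int(m["len"] or 0)

def retransmissions(lines):
    """Return (source, (start, end)) for data segments seen more than once."""
    seen, found = set(), []
    for _ts, src, seg, length in parse(lines):
        if length == 0:
            continue  # pure ACKs and SYNs carry no data
        key = (src, seg)
        if key in seen:
            found.append(key)
        seen.add(key)
    return found

def longest_gap(lines):
    """Return the longest silence, in seconds, between consecutive packets."""
    times = [ts for ts, *_ in parse(lines)]
    return max((b - a).total_seconds() for a, b in zip(times, times[1:]))

# The last four packets of the trace above, copied verbatim.
TRACE = [
    "09:12:15.631377 IP me.54726 > them.22: . ack 3723 win 33580 <nop,nop,timestamp 5 5>",
    "09:12:28.770436 IP me.54726 > them.22: P 3329:3405(76) ack 3723 win 33580 <nop,nop,timestamp 31 5>",
    "09:12:30.263923 IP me.54726 > them.22: P 3329:3405(76) ack 3723 win 33580 <nop,nop,timestamp 34 5>",
    "09:12:30.264757 IP them.22 > me.54726: . ack 3405 win 4197 <nop,nop,timestamp 35 31>",
]

print(retransmissions(TRACE))
print(longest_gap(TRACE))
```

The sketch only compares exact sequence ranges, so it would miss a
retransmission that was split or coalesced differently the second time.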
Furthermore, I have a little bit more information. In normal
operation, `them' establishes connections for other things, notably
live backup and two VPN setups. `me' participates in the VPN setups,
and has been trying to establish VPN connections to `them'.
The interesting thing is, snooping on `me', I see the SYNs going out
and nothing at all coming back, even though they're interleaved with
the above trace (I filtered them out) and thus the host is known
reachable at the time.
To me, this says that the daemon handling _those_ connections is (in
contrast to the ssh daemon) wedged hard enough that it's not even
accepting connections, so the listen queue has filled up. This, in
turn, makes it seem likely to me that the issue is that userland has
somehow partially wedged. My own guess is that anything still in core
is working but it's somehow unable to page anything in. But I got
remote hands to put a monitor on the console and send me a picture, and
it's not showing disk errors, which would have been my only explanation
for that.
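
The listen-queue inference above can be illustrated with a small Python
sketch: a listener that never calls accept(), and a client that keeps
trying to connect to it. This is only an illustration on the loopback
interface, not a reproduction of the failure on `them'; the function
name and its parameters are made up for the example, and what happens
once the queue is full (SYNs silently dropped versus connections
refused, and how many connections fit) varies between operating systems.

```python
import socket

def probe_unaccepted_listener(attempts=8, backlog=1, timeout=0.3):
    """Connect repeatedly to a listener that never calls accept().

    Returns (completed, failed): how many connects finished the
    handshake and how many timed out or were refused.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick one
    listener.listen(backlog)          # small queue, and nobody accepts
    addr = listener.getsockname()

    clients, completed, failed = [], 0, 0
    try:
        for _ in range(attempts):
            c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            c.settimeout(timeout)
            clients.append(c)         # keep it open so the queue stays full
            try:
                c.connect(addr)
                completed += 1
            except OSError:           # covers timeouts and refusals
                failed += 1
    finally:
        for c in clients:
            c.close()
        listener.close()
    return completed, failed

if __name__ == "__main__":
    print(probe_unaccepted_listener())
```

On a stack that drops SYNs when the queue is full, the first few
connects complete and the rest time out, which from the client side
looks like SYNs going out with nothing coming back.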
I'm not sure what's wrong, but I fear the production sysadmin in me who
wants the machine back wins over the scientist in me who wants to
understand the failure mode; I'll ask the remote hands to reboot it.
But I see no reason to think SACK is anything but a distraction, with
the SACK that looked odd to me just being D-SACK in action. (If I'd
already been familiar with D-SACK it wouldn't've even looked odd.)
/~\ The ASCII Mouse
\ / Ribbon Campaign
X Against HTML mouse%rodents-montreal.org@localhost
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B