tech-net archive


Re: 5.2: something wrong with TCP retransmits?



> So overall, you are having serious loss problems.

Yes, packet loss was obviously involved.  It was an inter-city
connection over multiple large NSPs, so occasional packet loss is no
surprise.  The surprise was the TCP stack's failure to recover.

> TCP should cope, of course.

And that's the part that bothered me: it just went dead.  Completely
stopped retransmitting, for over ten minutes.

> Are you sure all the loss is between the tcpdump host and
> 216.46.14.122?

Not certain, of course, but it's likely.  That's 100Mbps, whereas the
next hop towards 216.46.14.122 is a consumer-grade DSL line.

> I would take a tcpdump on the sending machine.

I'm trying from a different piece of the 10.* network in question now;
if it hangs again I'll have full captures (this time I'm capturing on
both ends of the connection).
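
For anyone wanting to scan such a capture for a stall like the one
described above, here is a minimal sketch, not anything used in this
thread.  It assumes tcpdump's default text output, where each packet
line starts with an HH:MM:SS.ffffff timestamp; the names parse_time
and longest_gap are made up for illustration.

```python
from datetime import datetime

SECONDS_PER_DAY = 24 * 60 * 60


def parse_time(line):
    """Parse the leading HH:MM:SS.ffffff timestamp of a tcpdump text line."""
    return datetime.strptime(line.split()[0], "%H:%M:%S.%f")


def longest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the longest silence.

    Blank lines and lines that do not start with a timestamp (for example
    continuation lines from verbose output) are skipped.
    """
    best = (0.0, None, None)
    prev_time = None
    prev_line = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        try:
            now = parse_time(line)
        except ValueError:
            continue  # not a packet line
        if prev_time is not None:
            gap = (now - prev_time).total_seconds()
            if gap < 0:
                # Assumption: a negative gap means the capture crossed midnight.
                gap += SECONDS_PER_DAY
            if gap > best[0]:
                best = (gap, prev_line, line)
        prev_time, prev_line = now, line
    return best
```

Feeding it the text output from each end of the connection would show
whether the sender really went silent or whether its retransmits were
simply lost before reaching the other capture point.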

>> 15:52:41.989631 IP 10.0.7.14 > 216.46.14.122: icmp 64: echo request seq 0
>> 15:52:42.024922 IP 216.46.14.122 > 10.0.7.14: icmp 64: echo reply seq 0
> Those are short packets.  I've seen misterminated ethernets that work
> for short but not long.

None of it is real Ethernet, actually; termination is not relevant.
The parts under my control are all the twisted-pair that typically
passes for Ethernet these days; all the other parts are shared
large-carrier infrastructure, even less likely to be true Ethernet.

> I have been looking at TCP xplots from netbsd-5 for a while.   There
> are issues, but they are minor failures to be as aggressive as the
> spec permits; I've never seen something like this.

Neither have I.  That's why I found it so noteworthy.

I've tried artificially introducing packet loss by briefly turning off
net.inet.ip.forwarding on an intermediate host - it recovered fine.  I
then flooded the slowest hop on the path with ping -f -s 1000 and it
locked up for the duration of the ping but has now recovered.  I
suspect I'll have to reproduce the original setup to make it
misbehave again.  I ought to be able to do that, though it'll mean a
few more days of delay.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse%rodents-montreal.org@localhost
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

