Subject: Re: Networking speed
To: Mark White <mjw@celos.net>
From: Daniel Carosone <dan@geek.com.au>
List: tech-net
Date: 10/01/2004 10:05:44

On Thu, Sep 30, 2004 at 11:36:31PM +0100, Mark White wrote:
> > The problem can probably be seen quite easily with a packet capture
> > and tcptrace TSG plots, but you need to know what you're looking at.
>
> TSG recorded on NetBSD for MacOS->NetBSD transfers shows
> approx 10ms bursts of traffic (a few dozen segments),
> separated by gaps of a second or two.  Typical structure: at
> the end of a burst I see segments 1-2-6-7, followed by a
> second or so gap with no ACKs, then 3-4-5 out of order, and
> resends of 6-7.  Then a few ms of good traffic, and repeat.

This is definitely some kind of packet loss; the out-of-order packets
are ones that were not received the first time.  If you run the
capture on OS X, you should see them being sent.
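
As a rough illustration of that reading of the plot, here is a minimal
Python sketch that labels segments in the order they arrive. The function
name and the labels are made up for this example, and the segment numbers
are just the 1-2-6-7 / 3-4-5 / 6-7 pattern quoted above, not real capture
data.

```python
def classify_arrivals(segments):
    """Label each arriving segment number as new, hole-filling, or duplicate."""
    seen = set()
    highest = None  # highest segment number seen so far
    labels = []
    for seg in segments:
        if seg in seen:
            # Already received once: this copy is a resend.
            labels.append((seg, "duplicate"))
        elif highest is not None and seg < highest:
            # Arrives after a later segment: it was missing the first time.
            labels.append((seg, "fills hole"))
        else:
            # Advances the highest segment seen (possibly leaving a hole behind).
            labels.append((seg, "new"))
        seen.add(seg)
        highest = seg if highest is None else max(highest, seg)
    return labels


if __name__ == "__main__":
    for seg, label in classify_arrivals([1, 2, 6, 7, 3, 4, 5, 6, 7]):
        print(seg, label)
```

With that input, 3-4-5 come out as hole-fillers (the segments lost the
first time) and the second 6-7 as duplicates (needless resends).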

The pauses are retransmission timeouts; the sender is doing the congestion
avoidance backoff referred to previously. You may also notice that the
first few segments after a restart are more widely spaced.
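
The wider spacing after a restart is what slow-start looks like: after a
timeout the sender starts again from a window of one segment and grows it
each round trip. A deliberately simplified Python model follows; the
function name and the threshold value are assumptions for illustration,
and real stacks count bytes and react to individual ACKs rather than
whole round trips.

```python
def cwnd_after_timeout(rtts, ssthresh_segments):
    """Congestion window (in segments) at each round trip after a timeout."""
    cwnd = 1  # a retransmission timeout drops the window back to one segment
    history = []
    for _ in range(rtts):
        history.append(cwnd)
        if cwnd < ssthresh_segments:
            # Slow-start: the window doubles each round trip, up to the threshold.
            cwnd = min(cwnd * 2, ssthresh_segments)
        else:
            # Congestion avoidance: the window grows by one segment per round trip.
            cwnd += 1
    return history


if __name__ == "__main__":
    # Hypothetical threshold of 8 segments.
    print(cwnd_after_timeout(6, 8))
```

So only a handful of segments go out in the first couple of round trips,
which is why they look sparse on the plot before the line steepens again.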

It's hard to say exactly without looking at it, but the lack of *any*
ACKs is a little confusing there. You should see at least one more
re-ACK when the out-of-order segment comes in (just a little green
tick on the line).

Does setting net.inet.tcp.delack_ticks=0 on NetBSD help at all?

> Sometimes more than one range of segments is out-of-order,
> and occasionally I can see repeated ACKs of the last
> in-sequence segment in the gap.  Does this shed any more
> light?

That is more like what I'd expect to see.  Comparing plots from
captures taken at sender and receiver for the same session will be
interesting.

> Thanks for pointing out tcptrace, in any case. :-)

It is a wonderfully useful and simple tool.

> Trying several different cables and ports hasn't helped,
> BTW, but replugging and ifconfig ste0 down then up again
> occasionally makes the problem go away for a while.

You may have a duplex issue; Macs do something slightly odd with
respect to NWay (Apple's wishful thinking about how the standard
should have been written, rather than how it was).  It would be worth
confirming that all ports are autonegotiating 100/full, both as
reported by the box and the switch, just in case.

But more likely you're just seeing the ste fail to handle a fast chain
of packets and drop some; it's a well-known problem, at least for
some quad cards that put four of them behind a PCI-PCI bridge.  Is it
hard to try another NIC?

Another experiment to try: set the ste to 10 rather than 100. You may
actually get better throughput by avoiding the drops, and with them
the timeouts and slow-start.
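
To put rough numbers on why a slower link could win, here is a
back-of-envelope Python sketch. The figures are assumptions read off the
description above, not measurements: "a few dozen" is taken as 36
segments, the segment size as 1460 bytes, and the gap as 1.5 seconds.
The function name is made up for this example.

```python
def effective_throughput_bps(segments, mss_bytes, burst_s, gap_s):
    """Average bits per second for one burst followed by one idle gap."""
    bits_per_cycle = segments * mss_bytes * 8
    return bits_per_cycle / (burst_s + gap_s)


if __name__ == "__main__":
    # Assumed values: 36 segments of 1460 bytes in a 10 ms burst, then a 1.5 s stall.
    stalled = effective_throughput_bps(36, 1460, 0.010, 1.5)
    print(f"bursty 100 Mbit/s link: {stalled / 1e6:.2f} Mbit/s")
    print("nominal 10 Mbit/s link:  10.00 Mbit/s (only if it runs without loss)")
```

Under those assumptions the stalled transfer averages about 0.28 Mbit/s,
so even a 10 Mbit/s link running well below line rate would come out
ahead, provided it really does avoid the drops.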

--
Dan.
