NetBSD-Users archive


Re: i386 home firewall/router/nat bottleneck diagnostics?



yancm%sdf.lonestar.org@localhost writes:

>> 1) check out tcp send/receive buffer sizing with sysctl
>
> I'm not entirely sure what I need to be looking for. I believe
> these are the variables I need to be concerned with?:
>   net.inet.tcp.recvbuf_auto = 0
>   net.inet.tcp.recvbuf_inc = 16384
>   net.inet.tcp.recvbuf_max = 1048576
>   net.inet.tcp.recvspace = 65536
>   net.inet.tcp.sendbuf_auto = 0
>   net.inet.tcp.sendbuf_inc = 8192
>   net.inet.tcp.sendbuf_max = 262144
>   net.inet.tcp.sendspace = 32768
>
> I first tried bumping the max and recvspace values...did not
> see a qualitative change.
>
> I also tried setting auto=1. Again, no noticeable qualitative
> difference observed.

Setting auto (both), or bumping recvspace and sendspace to 131072, is
probably sufficient.  The issue is that a TCP connection can be limited
by its buffer size: the sender fills the pipe and then has to wait for
an ack before it can send more.
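
To put rough numbers on that limit, here is a minimal sketch of the
usual window/RTT arithmetic.  The function names and the 50 ms round
trip time are assumptions chosen for illustration, not measurements
from this thread; the buffer sizes are the sendspace/recvspace values
quoted above plus the suggested 131072.

```python
def max_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one TCP connection's throughput, in bits per second.

    At most one window of data can be in flight per round trip.
    """
    return window_bytes * 8 / rtt_seconds


def window_needed_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Buffer size needed to keep a link of the given speed full."""
    return link_bps * rtt_seconds / 8


if __name__ == "__main__":
    rtt = 0.050  # assumed 50 ms round trip; measure the real one with ping
    for window in (32768, 65536, 131072):
        mbps = max_throughput_bps(window, rtt) / 1e6
        print(f"{window:>7} byte buffer: at most {mbps:.1f} Mbit/s per connection")
```

With the assumed 50 ms round trip, a 32768 byte buffer caps a single
connection at a little over 5 Mbit/s.  On a short LAN round trip the
same buffer is rarely the limit, which is why the RTT matters as much
as the buffer size.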

>> 2) build pkgsrc/graphics/xplot, and use tcpdump2xplot to examine tcp's
>> behavior.  This is not that easy, but it's the best way to really
>> understand what's going on.
>
> At the moment I do not have X set up. This box sits in a corner
> with an old 10" monochrome CRT on top for when I can't ssh/putty
> into it.

(Grab xplot from graphics/xplot-devel instead; it's a little newer.
Make sure you get the patches from pkgsrc that I just committed - the
script had not caught up with recent perl, which no longer auto-assigns
split to @_, apparently.)

You don't need to run xplot on the box under test.  Basically, run

# tcpdump -w TCP.xplot tcp

on some machine, and then do speed tests from that box (each way).  You
can use wget or ttcp.

Then bring that tcpdump file back to a normal box (it doesn't have to
be NetBSD - just something you can build xplot on; Linux, Mac, or any
*BSD works fine) and run

  tcpdump -S -tt -r TCP.xplot | tcpdump2xplot

and then finally run xplot on the resulting .xplot files.

If you have a tcpdump file that shows the whole connection (including
the SYNs at beginning) and want to put it up where I can grab it, I'll
take a look.

I know the whole tcpdump2xplot stuff is a bit hard (see the READMEs in
the xplot sources for how to use it - it's an old-school X program from
the late 80s).  But it's the only way I know to understand what tcp is
actually doing.

>> 3) Try multiple tcp connections in parallel.  If you get a lot better
>> speed, then that's a hint that you have loss/buffer size issues, rather
>> than an interface bottleneck.
>
> Several connections does not seem to change the throughput. I do have
> slurm running and am able to watch the re0 interface fairly easily.

That's a clue that the issue is packet processing time, and not loss
causing TCP to slow way down.  TCP is very sensitive to loss because of
its "loss == congestion" assumption: it treats any lost packet as a
sign of congestion and backs off.

>> 4) use netstat -i, and netstat -p, before and after a big transfer (save
>> them to files, and diff).
>
> What am I looking for in netstat -i? the only relevant changes seem
> to be Ierrs, Oerrs and Colls? I do not see any I/Oerrs. I do see a few
> Colls increasing on the rtk0 interface.

That sounds ok, then.  I didn't expect this to turn up trouble, but it's
quick to check.  Also look at dmesg to see if the kernel is printing any
errors (probably not).

> netstat -p tcp? Too many entries even with diff...not sure what I should
> be looking for. I'm probably unworthy, but...clues?

Basically, you're looking for counters that are increasing that
typically don't increase when things are ok.  You'll see a few
retransmissions, a few incoming segments with duplicate data, etc., but
only a few, not 10% of the packets.  Concentrate on the stats under
packets received.  Also run netstat -p ip, and look for error counters.
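
If the diff output is too noisy to read, a small script can list only
the counters that moved.  This is a rough sketch that assumes each
counter line looks like "<number> <description>", as in netstat -p
output; the function names are made up for illustration.  Numbers
inside the description (e.g. byte counts in parentheses) are replaced
with "#" so the same counter matches across snapshots, and if two
sections use the same description the later one wins.

```python
import re

# A counter line: optional indentation, a number, then a description.
COUNTER_LINE = re.compile(r"^\s*(\d+)\s+(\S.*)$")


def parse_counters(text: str) -> dict[str, int]:
    """Map each counter description to its leading value."""
    counters = {}
    for line in text.splitlines():
        m = COUNTER_LINE.match(line)
        if m:
            # Mask embedded numbers so labels are stable between snapshots.
            label = re.sub(r"\d+", "#", m.group(2)).strip()
            counters[label] = int(m.group(1))
    return counters


def changed_counters(before: str, after: str) -> dict[str, int]:
    """Return {description: increase} for counters that differ."""
    old, new = parse_counters(before), parse_counters(after)
    return {
        label: new[label] - old.get(label, 0)
        for label in new
        if new[label] != old.get(label, 0)
    }
```

Save "netstat -p tcp" to one file before the transfer and another
after, read both files, and print changed_counters(before, after).
Anything that grows by more than a small fraction of the packets
received is worth a closer look.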

For example, here are the non-zero counters on my machine.  The 16/1/2
are real errors, but that's out of 58M packets.

ip:
        58125407 total packets received
        16 bad header checksums
        1 with data size < data length
        2 with incorrect version number
        12961 fragments received
        34 fragments dropped after timeout
        6455 packets reassembled ok
        50589792 packets for this host
        2711144 packets forwarded (0 packets fast forwarded)
        263285 packets not forwardable
        47637071 packets sent from this host
        1284570 packets sent with fabricated ip header
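
To see how small that is, the three error counters can be expressed as
a fraction of the packets received.  This is just the arithmetic on
the numbers above; the function name is made up for illustration.

```python
def error_fraction(error_counts: list[int], total_packets: int) -> float:
    """Fraction of received packets that hit one of the given error counters."""
    return sum(error_counts) / total_packets


if __name__ == "__main__":
    # Bad header checksums, data size < data length, incorrect version number.
    errors = [16, 1, 2]
    total = 58125407  # total packets received
    frac = error_fraction(errors, total)
    print(f"error fraction: {frac:.2e} (one packet in {1 / frac:,.0f})")
```

That works out to roughly one bad packet in three million, nowhere
near the 10% level that would point to a real problem.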



