Subject: Re: about UDP tests
To: Eric Auge <eau@phear.org>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-net
Date: 05/12/2005 11:39:34
In message <d5vn9s$j2u$1@sea.gmane.org>, Eric Auge writes:
>Hello,
>
>Ok here is another thought :
>
>after some more measurements, and 2 box connected w/ a crosscable,
>I had a really strange big loss of a thousand packets, that are not 
>reported even in the internal counters  (systat inet.ip 1 :run).
>
>sending 10 000 000 udp packets at 40 000 p/s rate :
>
>I'm speaking about internal counters only :
>
>sender reports 10 000 000 packets sent
>received reports 9 993 728 (missing 6272)
>
>and my udp server test code reports
>6272 missing.
>
>It doesn't happen all the time, just since I'm stress testing.


Eric,

If I were in your shoes, and I wanted to achieve predictable throughput,
I would start by gaining a firm quantitative understanding of:

	1) the application,
	2) the OS,
	3) the hardware (CPU, memory, I/O bus, and NIC), and
	4) the other applications which can consume hardware resources.


Depending on exactly what values you have for those data-points, you
could run into trouble at any of the following well-known bottlenecks:

	a) The network link or switches
	b) Inside your NIC
	c) Between the NIC and its (DMA) memory buffers
	d) Latency issues between the NIC and CPU: that is, the NIC
	   hardware may not support adequate memory buffering for the
	   aggregate load on other system components.
	   Note that with good NICs and drivers, you can also trade off
	   latency against interrupt rate.  Here, you are trading off
	   higher latency and packet-queue depth for increased
	   throughput.  So another way to ask this is: what interrupt
	   rate is your NIC going to create at the desired throughput
	   rate?  Do you have enough CPU for a high rate?  Do you have
	   adequate buffering for a low rate?
	   
        e) Between the NIC and IP (or other network-layer protocol).
	   In your case, that's the ipintrq depth.

	f) Inside the transport-level protocol. In your case, you
	   are using UDP with a single stream, so the only relevant
	   limit here should be the socket receive queue (see the
	   setsockopt() sketch after this list).

	g) OS notification of the application (i.e., select() overheads),
	   which is also coupled to the next possibility, viz.:

	h) CPU contention. When other applications (processes) are running
	   and data is arriving, how long does it take for the OS
	   to switch back to your process and start it running?
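
For limit (f), one quick experiment is to enlarge the socket receive
buffer on your test receiver and see whether the loss count moves.
A minimal sketch, assuming an IPv4 UDP socket; the 4 MB figure is just
an arbitrary value to try, and the kernel may clamp it (on NetBSD,
kern.sbmax bounds how big it can get, if memory serves):

    /*
     * Sketch only: bump SO_RCVBUF on a UDP socket and report what the
     * kernel actually granted.  Error handling kept minimal.
     */
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <stdio.h>
    #include <stdlib.h>

    int
    main(void)
    {
        int s, rcvbuf = 4 * 1024 * 1024;    /* arbitrary value to try */
        socklen_t len = sizeof(rcvbuf);

        if ((s = socket(AF_INET, SOCK_DGRAM, 0)) == -1) {
            perror("socket");
            exit(1);
        }
        if (setsockopt(s, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf)) == -1)
            perror("setsockopt(SO_RCVBUF)");
        if (getsockopt(s, SOL_SOCKET, SO_RCVBUF, &rcvbuf, &len) == 0)
            printf("SO_RCVBUF is now %d bytes\n", rcvbuf);
        /* ... bind() and the usual recvfrom() loop would go here ... */
        return 0;
    }

If raising SO_RCVBUF makes the missing-packet count shrink or vanish,
the drops are at (f); if it changes nothing, look earlier in the list.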

When I first read your message, I thought the first thing to
investigate was limit (h) above.  The obvious test is to write a
CPU-eating program -- int main() { while (1); } -- and see how running
that program impacts your idle-machine, 6e4 packet/sec case.

I'd modify the UDP receiver to print the received packet rate once per
second, monitor the system with systat -w1 vmstat, and look for
consistent changes between the 60k packet/sec state and the 20k
pkt/sec case (or whatever your current numbers are).
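
Something along these lines would do for the per-second printout; this
is only a sketch, with a placeholder port (9000) and an arbitrary
buffer size, and it assumes your test sender is blasting that same port:

    /*
     * Sketch: UDP sink that prints packets received per second.
     * Since recv() blocks, the rate line only appears while traffic
     * is actually flowing, which is fine for a flood test.
     */
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    int
    main(void)
    {
        struct sockaddr_in sin;
        char buf[2048];
        unsigned long count = 0;
        time_t last, now;
        int s;

        if ((s = socket(AF_INET, SOCK_DGRAM, 0)) == -1) {
            perror("socket");
            exit(1);
        }
        memset(&sin, 0, sizeof(sin));
        sin.sin_family = AF_INET;
        sin.sin_addr.s_addr = htonl(INADDR_ANY);
        sin.sin_port = htons(9000);    /* placeholder port */
        if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1) {
            perror("bind");
            exit(1);
        }
        last = time(NULL);
        for (;;) {
            if (recv(s, buf, sizeof(buf), 0) > 0)
                count++;
            if ((now = time(NULL)) != last) {
                printf("%lu pkt/s\n", count);
                count = 0;
                last = now;
            }
        }
        /* NOTREACHED */
    }

Run it alongside the CPU-eater above and watch whether the printed rate
sags whenever the hog gets scheduled.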


[...]

>the nic card used on the server side is a realtek (rtk0).

Just a guess, but modern x86 machines should have no trouble
keeping up with 100Mbit traffic at the rates you describe.  OTOH,
100Mbit Realtek NICs have a reputation for less-than-stellar
performance.

But note that if you're running into bottlenecks near the head of the
alphabetical list above, systat may not show the culprit.