tech-net archive

Re: [patch] ixg(4): Intel 82599 10-gigabit ethernet

On Tue, Aug 09, 2011 at 06:58:26PM -0500, David Young wrote:
> With all of the hardware acceleration options turned on and with
> iperf(1) bound to the 0th CPU and running 4 threads, TCP transmission
> speeds of 5 Gb/s are possible (receiver was a FreeBSD box).  The maximum
> TCP receive speed I have observed is 3.6 Gb/s (transmitter was FreeBSD.)

I am curious how performance changes with large frames.  In particular,
I would expect receive performance to be somewhat better, since we
don't have any form of aggregated handling of packets from the same TCP
stream, like Linux does (they call it "Large Receive Offload" which is
rather misleading), though on send we can use this device's large send

(Note that 9K frames are a *bad* idea.  Try a frame size that actually
fits neatly in two pages, such as 8K minus overhead.)
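
The page arithmetic behind that note can be sketched roughly as follows.
This assumes 4096-byte pages and counts only a 14-byte Ethernet header as
overhead; both values are assumptions for illustration (the real overhead
depends on the driver and its buffer layout), and the function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; real systems and drivers differ. */
#define PAGE_SIZE_ASSUMED  4096u  /* a common page size, e.g. on x86 */
#define ETHER_HDR_ASSUMED  14u    /* dst MAC + src MAC + EtherType, no VLAN tag */

/* Number of whole pages needed to hold a buffer of `len` bytes. */
size_t
pages_needed(size_t len, size_t page_size)
{
	assert(page_size > 0);
	return (len + page_size - 1) / page_size;
}

/* Largest MTU whose frame (MTU + per-frame overhead) fits in `npages` pages. */
size_t
mtu_for_pages(size_t npages, size_t page_size, size_t overhead)
{
	assert(npages * page_size > overhead);
	return npages * page_size - overhead;
}
```

Under those assumptions, a 9000-byte MTU gives a 9014-byte frame, which
spills into a third page, while an 8178-byte MTU (8192 - 14) fills
exactly two.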

> If I don't bind iperf(1) to one CPU, then lock contention on my 2-CPU
> test boxes gobbles up a lot of time and TCP transmission performance
> plummets.  Our network stack doesn't do SMP well. :-/

If you have a few minutes to try it, I am curious whether reverting my
change to suck all packets off the input queue under one hold of the lock
(in ip_input.c, about a year ago) has any effect -- and if so, what effect.
I'd expect it might make things much worse to revert that, but I would
think we finally have a case where we should see _some_ effect, anyway.
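
To make the comparison concrete, here is a minimal userland sketch of the
two dequeue strategies.  It is not the ip_input.c code: the queue, the
packet type, the pthread mutex, and every name in it are hypothetical
stand-ins, and it assumes "one hold of the lock" means detaching the whole
list in a single critical section.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical packet and queue types; not NetBSD's mbuf or ifqueue. */
struct pkt {
	struct pkt *next;
};

struct pktq {
	pthread_mutex_t lock;
	struct pkt *head, *tail;
	unsigned long drain_lock_holds;  /* lock acquisitions by the consumer */
};

void
pktq_init(struct pktq *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->head = q->tail = NULL;
	q->drain_lock_holds = 0;
}

void
pktq_enqueue(struct pktq *q, struct pkt *p)
{
	assert(p != NULL);
	p->next = NULL;
	pthread_mutex_lock(&q->lock);
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	pthread_mutex_unlock(&q->lock);
}

/* Placeholder handler; a real stack would process the packet here. */
void
pkt_discard(struct pkt *p)
{
	(void)p;
}

/* Per-packet variant: take and drop the lock once for every packet. */
size_t
drain_one_at_a_time(struct pktq *q, void (*handle)(struct pkt *))
{
	size_t n = 0;

	for (;;) {
		pthread_mutex_lock(&q->lock);
		q->drain_lock_holds++;
		struct pkt *p = q->head;
		if (p != NULL) {
			q->head = p->next;
			if (q->head == NULL)
				q->tail = NULL;
		}
		pthread_mutex_unlock(&q->lock);
		if (p == NULL)
			return n;
		handle(p);
		n++;
	}
}

/* Batched variant: detach the whole list under a single hold of the lock. */
size_t
drain_batch(struct pktq *q, void (*handle)(struct pkt *))
{
	pthread_mutex_lock(&q->lock);
	q->drain_lock_holds++;
	struct pkt *p = q->head;
	q->head = q->tail = NULL;
	pthread_mutex_unlock(&q->lock);

	size_t n = 0;
	while (p != NULL) {
		struct pkt *next = p->next;  /* read before handle() may reuse p */
		handle(p);
		p = next;
		n++;
	}
	return n;
}
```

With N packets queued, the per-packet variant takes the lock N + 1 times
(the last one only to find the queue empty), while the batched variant
takes it once.  That difference in acquisitions is what would show up as
contention on an SMP box.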

