Subject: network backend improvements
To: None <port-xen@NetBSD.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: port-xen
Date: 10/03/2005 00:11:16
Hi,
I've committed changes to the network backend and frontend which reduce the
number of hypercalls and interrupts, and avoid some unneeded copies when
packets are sent/received.
I did some performance tests on a dual-CPU 350 MHz Pentium II:
cpu0: Intel Pentium II (686-class), 350.80 MHz, id 0x652
cpu0: features 183fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu0: features 183fbff<PGE,MCA,CMOV,PAT,PSE36,MMX>
cpu0: features 183fbff<FXSR>
cpu0: I-cache 16 KB 32B/line 4-way, D-cache 16 KB 32B/line 4-way
cpu0: L2 cache 512 KB 32B/line 4-way
cpu0: ITLB 32 4 KB entries 4-way, 2 4 MB entries fully associative
cpu0: DTLB 64 4 KB entries 4-way, 8 4 MB entries 4-way
cpu0: 32 page colors
I used ttcp in a routing environment (no bridge; dom0 is used as a router).
The Linux domU is running:
Linux sl4 2.6.11.10-xenU #1 Sun May 22 11:42:16 BST 2005 i686 i686 i386 GNU/Linux
The NetBSD domU is running an up-to-date current kernel.
With ipf enabled (no rules loaded):
                              before changes    after changes
netbsd-domU -> dom0           13130 KB/sec      10972 KB/sec
dom0 -> netbsd-domU           10076 KB/sec      11172 KB/sec
linux-domU -> dom0            12786 KB/sec      12530 KB/sec
dom0 -> linux-domU            11978 KB/sec      13817 KB/sec
netbsd-domU -> linux-domU      9712 KB/sec       9722 KB/sec
linux-domU -> netbsd-domU      7710 KB/sec       8160 KB/sec
With ipf disabled:
                              before changes    after changes
netbsd-domU -> dom0           14357 KB/sec      17745 KB/sec
dom0 -> netbsd-domU           13113 KB/sec      15175 KB/sec
linux-domU -> dom0            16239 KB/sec      22179 KB/sec
dom0 -> linux-domU            18839 KB/sec      20005 KB/sec
netbsd-domU -> linux-domU     11122 KB/sec      12307 KB/sec
linux-domU -> netbsd-domU      7369 KB/sec      12647 KB/sec
The poor results with ipf enabled are because ipf causes the packet
to be copied when it comes from the network backend (the first thing
done in the ipf input path is m_makewritable(), and it looks like
m_makewritable() is less efficient than memcpy()). With ipf disabled, the
performance gain is appreciable: between about 6% and 72% depending on the
path.
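
For anyone who wants the relative changes rather than the raw numbers, here
is a quick sketch that computes them from the two tables above. The figures
are copied from this mail; the names pct_change() and report() are just
illustrative helpers, not part of any tool used for the tests.

```python
# Throughput figures (KB/sec) copied from the two tables in this mail:
# path -> (before changes, after changes).
IPF_ENABLED = {
    "netbsd-domU -> dom0": (13130, 10972),
    "dom0 -> netbsd-domU": (10076, 11172),
    "linux-domU -> dom0": (12786, 12530),
    "dom0 -> linux-domU": (11978, 13817),
    "netbsd-domU -> linux-domU": (9712, 9722),
    "linux-domU -> netbsd-domU": (7710, 8160),
}

IPF_DISABLED = {
    "netbsd-domU -> dom0": (14357, 17745),
    "dom0 -> netbsd-domU": (13113, 15175),
    "linux-domU -> dom0": (16239, 22179),
    "dom0 -> linux-domU": (18839, 20005),
    "netbsd-domU -> linux-domU": (11122, 12307),
    "linux-domU -> netbsd-domU": (7369, 12647),
}


def pct_change(before, after):
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


def report(title, results):
    """Print one line per path with its signed percentage change."""
    print(title)
    for path, (before, after) in results.items():
        print(f"  {path:<28} {pct_change(before, after):+6.1f}%")


if __name__ == "__main__":
    report("With ipf enabled (no rules loaded):", IPF_ENABLED)
    report("With ipf disabled:", IPF_DISABLED)
```

A negative value means the path got slower after the changes, which is the
case for netbsd-domU -> dom0 with ipf enabled.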
I also did some minimal tests in a bridge setup, running ttcp between a domU
and an external box (in which case no packet should be copied in domain0
at all). The system CPU usage in domain0 shown by top is reduced by about
20%.
Note that to take full advantage of the changes I made in dom0, you have to
build your dom0 kernel with options MCLSHIFT=12 (do a full clean build, as
this option isn't defopt'ed). This causes the mbuf cluster storage to be
exactly one page (as opposed to half a page with the default value).
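
The arithmetic behind that option, as a small sketch: the cluster size is
1 << MCLSHIFT bytes. The 4 KB page size (PAGE_SHIFT = 12) is an assumption
for i386, and the default MCLSHIFT of 11 is inferred from the "half a page"
remark above; cluster_bytes() is a hypothetical helper, not a kernel
function.

```python
PAGE_SHIFT = 12                 # assumption: 4 KB pages, as on i386
PAGE_SIZE = 1 << PAGE_SHIFT     # 4096 bytes

DEFAULT_MCLSHIFT = 11           # inferred: default cluster is half a page
TUNED_MCLSHIFT = 12             # value from "options MCLSHIFT=12"


def cluster_bytes(mclshift):
    """Size in bytes of an mbuf cluster for a given MCLSHIFT."""
    return 1 << mclshift


if __name__ == "__main__":
    for shift in (DEFAULT_MCLSHIFT, TUNED_MCLSHIFT):
        size = cluster_bytes(shift)
        print(f"MCLSHIFT={shift}: {size} bytes, {size / PAGE_SIZE:g} page(s)")
```

With MCLSHIFT=12 a cluster is the same size as a page, so under these
assumptions one cluster's storage can correspond to exactly one page.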
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference
--