Port-xen archive
Re: 50% slowdown on 2 processor XEN 4.1.3 vs 1 processor system
On Mon, 26 Nov 2012 11:39:28 +0100
Roger Pau Monné <roger.pau%citrix.com@localhost> wrote:
> Hello,
>
> On 26/11/12 10:00, Harry Waddell wrote:
> >
> > I just built a new server with 2 x E5-2630 processors and was comparing the
> > performance to a nearly identical xen server with 1 x E5-1660 processor and
> > found that on a per core basis, instead of being about 50% faster ( as one
> > would expect from the clock speed given that the architecture is nearly
> > identical ), the E5-1660 system is 300% faster, so I ran some benchmarks
> > and started looking for a pattern.
> >
> > Both systems run NetBSD amd64 6.0-STABLE with xen 4.1.3. The 2
> > processor system is running only 1/2 as fast with the XEN3_DOM0 kernel
> > as with the GENERIC kernel, e.g. with simple single threaded benchmarks
> > like dhrystone and whetstone. ( 22624434.0 vs 10416667.0 dhrystones -- It's
> > also clearly slower when compiling, etc... ) The one processor system works
> > just fine.
>
> I'm a little bit lost here, are you comparing the speed of the Dom0 vs a
> baremetal install?
Yes. A xen domU and a xen dom0 are both less than 1/2 the speed of a baremetal
install with a GENERIC kernel. I have a single physical cpu system that is very
similar which does not show this disparity.
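
For reference, the slowdown implied by the dhrystone figures quoted above can be
worked out as follows. This is only a sketch: it assumes the larger score is the
GENERIC result and the smaller one is XEN3_DOM0 (the original message does not
label them), and the variable names are made up for illustration.

```python
# Dhrystone scores quoted in the original message.
# Assumption: the larger score is GENERIC, the smaller one is XEN3_DOM0.
generic_dhrystones = 22624434.0
dom0_dhrystones = 10416667.0

# Fraction of bare-metal throughput that is retained under Xen.
relative_speed = dom0_dhrystones / generic_dhrystones
slowdown_pct = (1.0 - relative_speed) * 100.0

print(f"dom0 runs at {relative_speed:.2f}x of GENERIC ({slowdown_pct:.0f}% slower)")
```

Under that assumption the dom0 retains a bit less than half of the bare-metal
throughput, which matches the "less than 1/2 the speed" description.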
>
> > So far, I've tried:
> >
> > 1. a netbsd 6.0 release XEN3_DOM0 kernel
> >
> > 2. run the test in a netbsd domU
> >
> > 3. disabled HT and NUMA
> >
> > 4. used xl to create pools based on NUMA nodes and assigned/pinned dom0 to one
> >
> > 5. compiled and installed xen 4.2.0rc4
>
> Have you checked the output of xl info, to see if the number of CPUs,
> NUMA nodes and clock speed is consistent?
>
Looks fine, at least to me:
release : 6.0
version : NetBSD 6.0 (XEN3_DOM0)
machine : amd64
nr_cpus : 24
max_cpu_id : 23
nr_nodes : 2
cores_per_socket : 6
threads_per_core : 2
cpu_mhz : 2300
hw_caps : bfebfbff:2c100800:00000000:00003f40:17bee3ff:00000000:00000001:00000000
virt_caps : hvm hvm_directio
total_memory : 65508
free_memory : 60568
sharing_freed_memory : 0
sharing_used_memory : 0
free_cpus : 0
xen_major : 4
xen_minor : 1
xen_extra : .3
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset : unavailable
xen_commandline : dom0_mem=4096M dom0_max_vcpus=1
cc_compiler : gcc (NetBSD nb2 20110806) 4.5.3
cc_compile_by : root
cc_compile_domain :
cc_compile_date : Mon Nov 26 08:09:33 UTC 2012
xend_config_format : 4
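
One way to make the "is this consistent?" check mechanical is to parse the
`key : value` lines and compare nr_cpus against the topology fields. The sketch
below is hypothetical: the helper names are invented here, and it assumes one
socket per NUMA node, which holds for this 2 x E5-2630 box but is not guaranteed
in general.

```python
def parse_xl_info(text):
    """Parse 'key : value' lines, as printed by `xl info`, into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def expected_cpus(info):
    # Assumption: exactly one socket per NUMA node.
    return (int(info["nr_nodes"])
            * int(info["cores_per_socket"])
            * int(info["threads_per_core"]))


# Subset of the output shown above.
sample = """\
nr_cpus : 24
nr_nodes : 2
cores_per_socket : 6
threads_per_core : 2
cpu_mhz : 2300
"""

info = parse_xl_info(sample)
print("reported:", info["nr_cpus"], "expected:", expected_cpus(info))
```

For the values above, both numbers come out as 24, so the CPU count at least
agrees with the reported topology.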
> Also, I would recommend disabling any kind of energy savings in the BIOS
> and trying again.
>
I had the cpu power controls set to "energy efficient" and "balanced
performance" in the BIOS. I just disabled all the cpu power management and it
doesn't seem to have any effect.
> >
> > and nothing seems to influence the disparity in the performance.
> >
> > Has anyone else seen similar behavior, or does anyone have any suggestions
> > on how to proceed? Removing a cpu is kind of dangerous, so I'd like to
> > avoid that, but if there is a good xen dom0 linux live CD, or something
> > similar, I could try booting and testing under linux? It was pretty
> > difficult getting netbsd to install and boot on my 6TB raid 5 that
> > I'd hate to blow that away for such a test, but I do have usb keys I could
> > install into etc. I assume that if linux dom0's had this issue I would have
> > found something about it during my searches, so I'm guessing this is a BSD
> > issue, but that's still just an assumption, so if there's an easyish way to
> > test that theory, I'll do it.
>
> I've checked some time ago the performance of Linux vs NetBSD as a PV
> guests on both NetBSD and Linux Dom0, and the difference was not that
> big: http://www.slideshare.net/xen_com_mgr/free-and-net-bsd-xen-roadmap
> (see the last part of the slides for the perf results)
>
>
Since the slides use sysbench, I tried sysbench's cpu test as well.
XEN3_DOM0 kernel:
-----------------
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 1
Doing CPU performance benchmark
Threads started!
Done.
Maximum prime number checked in CPU test: 10000
Test execution summary:
total time: 27.0904s
total number of events: 10000
total time taken by event execution: 27.0786
per-request statistics:
min: 2.70ms
avg: 2.71ms
max: 3.39ms
approx. 95 percentile: 2.71ms
Threads fairness:
events (avg/stddev): 10000.0000/0.00
execution time (avg/stddev): 27.0786/0.00
GENERIC
-------
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 1
Doing CPU performance benchmark
Threads started!
Done.
Maximum prime number checked in CPU test: 10000
Test execution summary:
total time: 12.5222s
total number of events: 10000
total time taken by event execution: 12.5140
per-request statistics:
min: 1.25ms
avg: 1.25ms
max: 1.35ms
approx. 95 percentile: 1.25ms
Threads fairness:
events (avg/stddev): 10000.0000/0.00
execution time (avg/stddev): 12.5140/0.00
---------
However I measure it, CPU performance under the XEN3_DOM0 kernel is less than
half that of GENERIC (27.09s vs 12.52s in the sysbench runs above), but only on
the two physical processor system. Here are the results for the dom0 of a very
similar 1 processor system with a faster clock speed. Please note that this
system is actually in use running several domUs.
release : 6.0_STABLE
version : NetBSD 6.0_STABLE (XEN3_DOM0)
machine : amd64
nr_cpus : 12
nr_nodes : 1
cores_per_socket : 6
threads_per_core : 2
cpu_mhz : 3300
hw_caps : bfebfbff:2c100800:00000000:00003f40:13bee3ff:00000000:00000001:00000000
virt_caps : hvm hvm_directio
total_memory : 65508
free_memory : 54439
free_cpus : 0
xen_major : 4
xen_minor : 1
xen_extra : .3
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 1
Doing CPU performance benchmark
Threads started!
Done.
Maximum prime number checked in CPU test: 10000
Test execution summary:
total time: 9.9198s
total number of events: 10000
total time taken by event execution: 9.9155
per-request statistics:
min: 0.98ms
avg: 0.99ms
max: 40.91ms
approx. 95 percentile: 0.99ms
Threads fairness:
events (avg/stddev): 10000.0000/0.00
execution time (avg/stddev): 9.9155/0.00
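
To compare the three sysbench runs across different clock speeds, the total
times can be scaled by the cpu_mhz values that xl info reports. This is a rough
sketch only: it assumes the nominal clock is what the benchmark actually ran at
(it ignores turbo boost and any other frequency scaling), that the GENERIC run
used the same 2300 MHz machine, and the labels are made up for illustration.

```python
# (total time in seconds, nominal cpu_mhz) taken from the outputs above.
runs = {
    "2-socket XEN3_DOM0": (27.0904, 2300),
    "2-socket GENERIC": (12.5222, 2300),
    "1-socket XEN3_DOM0": (9.9198, 3300),
}


def clock_normalised(total_time_s, cpu_mhz):
    # Time multiplied by clock is a crude proxy for cycles spent.
    return total_time_s * cpu_mhz


baseline = clock_normalised(*runs["1-socket XEN3_DOM0"])
relative = {name: clock_normalised(t, mhz) / baseline
            for name, (t, mhz) in runs.items()}

for name, rel in relative.items():
    print(f"{name}: {rel:.2f}x the clock-normalised time of the 1-socket dom0")
```

With these assumptions the GENERIC kernel on the 2 processor system is in line
with (slightly better than) the 1 processor dom0 per clock, while the dom0 on
the 2 processor system needs close to twice as long.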
I suppose I could still have missed something, but I'm pretty sure this is a
bug. What I don't know yet is whether it's a Xen bug or a NetBSD bug, so I'm
not sure where to submit it.
Also, I booted the Xen live CD based on Debian and Xen 3.2, but the OS is too
old to support the i350 ethernet on this new system, so without a network I
couldn't get what I needed to benchmark anything. I may be able to work around
this with a USB ethernet device, but right now I'm doing all of this remotely
using IPMI.
Thanks for looking into this.
Harry Waddell