
Re: Qemu storage performance drops when smp > 1 (NetBSD 9.3 + Qemu/nvmm + ZVOL)



Hi,

On 18.08.22 07:14, B. Atticus Grobe wrote:
> Forgive me if I've misread, but it seems like you're running 3 VMs with
> 2 vCPUs each on a dual-core processor. ZFS itself is also processor and RAM
> heavy (on every OS I've used it on, never have used it on NetBSD).
>
> So, it seems to me like you've taken an already not particularly fast
> processor and applied a really heavy load to it, especially if you're doing
> any zRAID or anything else.

That's right - I have a habit of getting the most out of low-end hardware ;-) This device is especially interesting: when SpeedStep kicks in and clocks the CPU down to 800 MHz, the whole machine draws less than 4 watts. That's not insignificant for a full-fledged open source x86_64 server these days.

I don't use a ZRAID but an ordinary striped zpool with only a single SSD, but that probably doesn't change the fact that I'm pushing the limits of what's possible with this constrained hardware.


> Someone actually familiar with the NetBSD kernel can correct me on this,
> but I'm fairly sure that the kernel itself is multithreaded, and handles
> interrupts on multiple cores.
>
> I would guess that with the single vCPU VMs, the processor was able to
> keep up, but now you've doubled down on the workload, and considerably
> increased the number of context switches that have to happen.

This realization is slowly dawning on me as well - especially given the concerns expressed about my setup.

I have made another attempt, analogous to my previous tests, with the only difference that I have completely disabled ZFS and the QCOW2 files now live on an FFSv2 file system with WAPBL.
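For context, the guests are started roughly like this - the image path, memory size, virtio device model and the guest-side write test below are placeholders/sketch material, not my literal commands:

    # host: qcow2 image on the FFSv2+WAPBL file system, nvmm acceleration
    qemu-system-x86_64 \
        -accel nvmm \
        -smp 1 -m 1024 \
        -drive file=/vm/test1.qcow2,format=qcow2,if=virtio \
        -nographic
    # (-smp 2 for the second series of numbers below)

    # guest: a large sequential write of this general shape
    dd if=/dev/zero of=/var/tmp/testfile bs=1m count=2048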

## 1 Core per VM

1: ~45 MB/s  (55% sys, 2% user)
2: ~86 MB/s  (94% sys, 4% user)
3: ~80 MB/s  (95% sys, 5% user)

## 2 Cores per VM

1: ~35 MB/s  (67% sys, 5% user)
2: <1 MB/s  (99% sys, 0.2% user)
3: <1 MB/s  (100% sys, 0.0% user)

In any case, the results with "2 cores" already confirm that in the cases where performance comes to a standstill, more than 99% of the CPU load is on the system side, so (probably?) a lot of expensive context switches and other "organizational" work is going on - in other words, absolute overload.
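For anyone who wants to watch the same split, the stock host tools show it directly while the guests are writing, e.g.:

    vmstat 1        # per-second summary; "us" / "sy" / "id" columns
    top             # aggregate CPU states plus the individual qemu processes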

I still can't explain why the performance in the "1 core" case with an otherwise identical setup is significantly below the values for ZVOL or QCOW2 on ZFS. Could this be a caching effect, or does iostat count certain things twice in the ZFS case?
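For clarity, the ZVOL variant I'm comparing against is of this general shape - pool and volume names, sizes and the exact zvol device node are placeholders, not my literal setup:

    # host: carve a volume out of the pool and hand it to the guest as a raw disk
    zfs create -V 20G tank/vm1
    qemu-system-x86_64 -accel nvmm -smp 1 -m 1024 \
        -drive file=/dev/zvol/rdsk/tank/vm1,format=raw,if=virtio \
        -nographic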


> Or maybe I'm completely wrong on every point. I'd certainly take a look
> at your CPU usage though.


I think you are not wrong... as I'm realizing, I probably need to lower my expectations for this device. On such limited hardware it is probably better to give each VM only one CPU core to reduce context switches, since these seem to be the reason for the performance drops. I will concentrate my tests on this for now and try to find out where the respective limits of ZVOL, QCOW2 on ZFS and QCOW2 on FFS lie in this configuration.

Kind regards
Matthias



