Re: nvmm users - experience

To: "Mathew, Cherry G.*" <c%bow.st@localhost>, Robert Nestor <rnestor%mac.com@localhost>
Subject: Re: nvmm users - experience
From: Matthias Petermann <mp%petermann-it.de@localhost>
Date: Wed, 24 May 2023 12:11:01 +0200

Hi Mathew,

On 23.05.23 15:11, Mathew, Cherry G.* wrote:


     MP> I came across Qemu/NVMM more or less out of necessity, as I had
     MP> been struggling for some time to set up a proper Xen
     MP> configuration on newer NUCs (UEFI only). The issue I encountered
     MP> was with the graphics output on the virtual host, meaning that
     MP> the screen remained black after switching from Xen to NetBSD
     MP> DOM0. Since the device I had at my disposal lacked a serial
     MP> console or a management engine with Serial over LAN
     MP> capabilities, I had to look for alternatives and therefore got
     MP> somewhat involved in this topic.

     MP> I'm using the combination of NetBSD 9.3_STABLE + Qemu/NVMM on
     MP> small low-end servers (Intel NUC7CJYHN), primarily for classic
     MP> virtualization, which involves running multiple independent
     MP> virtual servers on a physical server. The setup I have come up
     MP> with works stably and with acceptable performance.

I have a follow-on question about this - Xen has some config tooling
related to startup - so you can say something like

'xendomains = dom1, dom2' in /etc/rc.conf, and these domains will be
started during bootup.

If you did want that for nvmm, what do you use ?

Unfortunately, I didn't find anything suitable and was in a big hurry to
get this under control for myself, so I wrote a quick-and-dirty shell
script. It encapsulates starting VMs from the command line and from an
rc script, and creates the Unix domain sockets that serve the guest's
serial terminal and the Qemu monitor console. If you want to have a
look at it, I have uploaded it here (unfortunately without
documentation, and with a big warning that it was all put together in
great haste):


	https://forge.petermann-it.de/mpeterma/vmctl
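
The core idea of such a wrapper can be sketched roughly as follows. This
is not the actual vmctl code, only a minimal illustration. The helper
name vm_cmdline, the VM_RUNDIR directory and the ZVOL path are
assumptions made for the example; the Qemu options themselves are
standard ones. The function only prints the command line, so it can be
inspected before anything is started.

```shell
#!/bin/sh
# Sketch only, not the real vmctl: build the Qemu command line for one VM.
# vm_cmdline, VM_RUNDIR and the disk path are hypothetical names.

VM_RUNDIR=${VM_RUNDIR:-/var/run/vm}

# usage: vm_cmdline <name> <memory-MB> <disk-device>
vm_cmdline() {
    name=$1 mem=$2 disk=$3
    # One socket for the guest's serial console, one for the Qemu monitor.
    printf '%s ' \
        qemu-system-x86_64 \
        -accel nvmm \
        -m "$mem" \
        -drive "file=$disk,format=raw,if=virtio" \
        -display none \
        -serial "unix:$VM_RUNDIR/$name.serial,server=on,wait=off" \
        -monitor "unix:$VM_RUNDIR/$name.monitor,server=on,wait=off" \
        -pidfile "$VM_RUNDIR/$name.pid" \
        -daemonize
    printf '\n'
}

# Print the command for a VM called "net" with 1024 MB of RAM.
vm_cmdline net 1024 /dev/zvol/rdsk/tank/net
```

An rc script could loop over a list of VM names and run the printed
command for each one; a socket-capable client such as socat can then
attach to the serial or monitor socket.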


     MP> Scenario:

     MP> I have a small root filesystem with FFS on the built-in SSD, and
     MP> the backing store for the VMs is provided through ZFS ZVOLs. The
     MP> ZVOLs are replicated alternately every night (full and
     MP> incremental) to an external USB hard drive.

Are these 'zfs send' style backups ? or is the state on the backup USB
hard drive ready for swapping, if the primary fails for eg ?

Yes, I use zfs send and save the stream to files on the USB drive for my
regular backups, so they are not directly usable. The idea is
interesting though. I chose this approach back then because I do it
quite similarly on my FFS systems with dump, and the incremental aspect
was important to me. On the other hand, I have also tested pulling a
zfs send of all ZVOLs from the mini-server to my laptop and then
playing around locally with Qemu/nvmm on a "production copy".
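
In shell terms, the alternation between full and incremental streams
looks roughly like the sketch below. The dataset, snapshot names and
destination directory are made up for the example, and the function
only prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch: print the zfs commands for a full or an incremental backup.
# Dataset, snapshot names and target directory are hypothetical.

# usage: backup_cmds <dataset> <new-snap> <dest-dir> [<previous-snap>]
backup_cmds() {
    ds=$1 new=$2 dest=$3 prev=${4:-}
    echo "zfs snapshot $ds@$new"
    if [ -n "$prev" ]; then
        # Incremental: only the changes between @prev and @new.
        echo "zfs send -i $ds@$prev $ds@$new > $dest/$new.incr.zfs"
    else
        # Full: the complete state of the dataset at @new.
        echo "zfs send $ds@$new > $dest/$new.full.zfs"
    fi
}

backup_cmds tank/net 2023-05-23 /mnt/usb              # full
backup_cmds tank/net 2023-05-24 /mnt/usb 2023-05-23   # incremental
```

Restoring from such files means feeding the full stream and then each
incremental stream, in order, to zfs receive, which is why the files on
the USB drive cannot be used directly as a replacement disk.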


     MP> There are a total of 5 VMs:

     MP>     net (DHCP server, NFS and SMB server, DNS server) app
     MP> (Apache/PHP-FPM/PostgreSQL hosting some low-traffic web apps)
     MP> comm (ZNC) iot (Grafana, InfluxDB for data collection from two
     MP> smart meters every 10 seconds) mail (Postfix/Cyrus IMAP for a
     MP> handful of mailboxes)

     MP> Most of the time, the CPU usage of the host with this
     MP> "load" is around 20%. The provided services consistently respond
     MP> quickly.

Ok - and these are accounted as the container qemu processes' quota
scheduling time, I assume ? What about RAM ? Have you had a situation
where the host OS has to swap out ? Does this cause trouble ? Or does
qemu/nvmm only use pinned memory ?

I sized the VMs' RAM so that a few hundred MB remain free as a buffer on
the host. Memory has run out in the past, especially when zfs send
makes use of the buffer cache. Swapping then occurred as well, and
together with the I/O load already increased by zfs send, the system
slowed down so badly that response times were no longer acceptable. In
that state, only a restart of the host brought a complete recovery. I
got this under control with a tip someone gave me in #netbsd: I now
pipe the output of zfs send into dd, which has the oflag "direct" set
and takes over writing the file. Apparently this bypasses some of the
caching and avoids the situation.
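
The pipeline has roughly the following shape. This is a sketch: the
function name send_direct, the block size and the paths in the usage
comment are assumptions for illustration, not the exact command in use.

```shell
#!/bin/sh
# Sketch: route a zfs send stream through dd with direct I/O on the output.
# send_direct, the block size and the example paths are assumptions.

# usage: send_direct <dataset@snapshot> <output-file>
send_direct() {
    # zfs send only produces the stream; dd does the actual file writing.
    # oflag=direct asks dd to open the output file with O_DIRECT.
    zfs send "$1" | dd of="$2" bs=1048576 oflag=direct
}

# Example (needs an existing snapshot and a mounted backup drive):
#   send_direct tank/net@2023-05-24 /mnt/usb/net-2023-05-24.zfs
```

The point of the design is that the process writing the backup file is
dd rather than the shell redirecting zfs send, so the direct-I/O flag
applies to exactly the writes that were filling the cache.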

I can't say anything about pinned memory. The memory consumption of the
VMs is stable from the host's point of view; I haven't really tried
ballooning yet.


     MP> However, I have noticed that depending on the load, the clocks
     MP> of the VMs can deviate significantly. This can be compensated
     MP> for by using a higher HZ in the host kernel (HZ=1000) and
     MP> tolerant ntpd configuration in the guests. I have also tried
     MP> various settings with schedctl, especially with the FIFO
     MP> scheduler, which helped in certain scenarios with high I/O
     MP> load. However, this came at the expense of stability.

I assume this is only *within* your VMs, right ? Do you see this across
guest Operating Systems, or just specific ones ?

The time deviation is caused by missed interrupts in the guests. As I
said, there are a number of workarounds for this, and a number of very
good explanations in this thread:


	https://mail-index.netbsd.org/netbsd-users/2022/08/31/msg028894.html

I do not use operating systems other than NetBSD as guests in this
setup. As a test, I also ran various Linux distributions under nvmm. I
didn't test in depth, but I had a test VM with Alpine Linux running for
a while and had the impression that it ran as well as NetBSD.


     MP> Furthermore, in my system configuration, granting a guest more
     MP> than one CPU core does not seem to provide any
     MP> advantage. Particularly in the VMs where I am concerned about
     MP> performance (net with Samba/NFS), my impression is that
     MP> allocating more CPU cores actually decreases performance even
     MP> further. I should measure this more precisely someday...

ic - this is interesting - are you able to run some tests to nail this
down more precisely ?

I should definitely do that, and if you have a specific idea of what I
should try, feel free to let me know. I think the observations from
back then should also be seen in the context of my particular system.
Since I have only two CPU cores available, with effectively one Qemu
process and one Qemu I/O thread running outside the Qemu process, both
cores are already fully occupied under the full I/O load of a single
VM. It therefore does not seem improbable to me that, in this setup,
resources become scarce when another Qemu process is added (for the
VM's 2nd CPU).


     MP> If you have specific questions or need assistance, feel free to
     MP> reach out. I have documented everything quite well, as I
     MP> intended to contribute it to the wiki someday. By the way, I am
     MP> currently working on a second identical system where I plan to
     MP> test the combination of NetBSD 10.0_BETA and Xen 4.15.

There's quite a bit of goodies wrt Xen in 10.0 - mainly you can now run
accelerated as a Xen guest (hvm with the PV drivers active).

For now I only use "conventional" PV for my guest systems. But I also
have a pure NetBSD setup here at the moment, so I'm curious about the
comparison myself. So far I have measured about 5 times the bandwidth
with Xen on identical hardware, using Samba from a VM to transfer a
large file with caching effects excluded. This is my focus at the
moment, because on the Xen system I use VNDs on FFS, while on the
Qemu/nvmm system ZVOLs are in use. There are too many variables in the
equation at the moment ;-)


Kind regards
Matthias

