Port-xen archive


Re: NetBSD & Xen 4 - where are we ?

On Fri, 25 May 2012 10:05:31 +0200, Manuel Bouyer wrote:
The recent big project that completed is domU SMP. There is suspend/resume which is being worked on, but, as I understand it, is not completely stable yet (jym@ can give more details on this).

Here we go. Copy/pasting a mail I just wrote, so everyone can stay informed:

Three main tasks remain:

- 90%: debugging. The rest (drivers, pmap suspend, xen routines...) is done and committed in -6 and -current.

- 5%: coding suspend for secondary CPUs, now that we have MP support in Xen. I have not written it yet, as I am currently focused on a tedious bug hunt (see below). The code is not really difficult: it boils down to putting a CPU into cpu_idle() and having a routine to reconfigure it upon resume (this should be fairly close to the trampolines we have for ACPI resume).

- 5%: finding a way to deplete the per-CPU pool_cache(9) caches and wait safely for the operation to complete. As pool_cache(9) has no interlock for its per-CPU structures, the only way to do that would be through cross-calls (implementing some form of interlocking was rightfully NACKed for performance reasons). Unfortunately, xc_wait() cannot be used from interrupt context, which makes the use of pool_cache_invalidate() impossible. Modifying pool_cache_invalidate() is not an option either, as it is used by interrupt code. I am about to post a mail to tech-kern@ about this; I encounter this situation quite often, in a few places (kauth(9)), and have no solution at the moment.

The debugging part is what I am currently struggling with, and have been for quite a few weeks: the hypervisor refuses to resume the domU because it detects bits set in the VM mappings (PG_G, 90% of the time) that NetBSD Xen never manages or enables (and rightly so). *sigh*
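To illustrate the kind of check involved, here is a minimal userland sketch in C. It is not code from the NetBSD tree: the function name count_global_ptes() and the idea of handing it a flat array of 64-bit entries are assumptions made for illustration only. The only values taken as given are the architectural x86 PTE bits, present (bit 0) and global (bit 8, which is what PG_G designates). The sketch walks the entries and reports every present one that has the global bit set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural x86 page-table entry flag bits. */
#define PTE_P 0x001ULL	/* present */
#define PTE_G 0x100ULL	/* global (PG_G) */

/*
 * Hypothetical helper: scan n page-table entries and count the
 * present ones that have the global bit set, printing each hit.
 * Non-present entries are skipped (an assumption of this sketch).
 */
static size_t
count_global_ptes(const uint64_t *ptes, size_t n)
{
	size_t i, hits = 0;

	assert(ptes != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if ((ptes[i] & PTE_P) != 0 && (ptes[i] & PTE_G) != 0) {
			printf("pte[%zu] = 0x%llx has the global bit set\n",
			    i, (unsigned long long)ptes[i]);
			hits++;
		}
	}
	return hits;
}
```

Running something like this over a dump of the page tables taken just before suspend would at least narrow down which ranges of the mappings carry the stray bit, although it says nothing about which code path set it.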

I am currently failing to identify where this comes from. It started to appear when our x86 pmap dropped "alternative recursive mappings" in favor of "ephemeral mappings + pmap reference counters" (I can send you the commit that enabled this, alongside a small diff to see what part of the pmap is affected). What bugs me is that this patch should not affect port-xen VM by much; it even greatly simplified suspend/resume.


Jean-Yves Migeon
