Re: Migration - Xen 4.3
On 19/01/2015 13:09, Stephen Borrill wrote:
On Sun, 18 Jan 2015, Brian Marcotte wrote:
Has anyone tried Migration recently?
I'm getting this on Xen 4.3, Linux dom0:
# xl migrate mydomU otherdom0host
libxl: error: libxl_dom.c:1063:libxl__domain_suspend_common_callback:
guest didn't acknowledge suspend, cancelling request
Curious: the ack is done when HYPERVISOR_suspend() gets called by the domU
(see link just below).
On the domU (netbsd) console, I see this:
xenbus_shutdown_handler: xenbus_rm 13
Flushing disk caches: 21 done
The domain is completely locked up and doesn't even respond to
Before xenbus handlers get called, do you see any of the DPRINTK(...)
messages found on this page:
It is similar with "xm migrate".
Migration is merely a suspend => snapshot (done by hypervisor/xl/xend)
=> resume on the other host. So if either "suspend" or "resume" does not
work, neither does migrate.
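Since migration is built from those two steps, it can help to exercise
them separately on a single dom0 to see which one fails. A minimal
sketch, assuming the xl toolstack: "xl save" and "xl restore" are the
standard subcommands, while the function name, the XL override variable
and the checkpoint path are hypothetical and only there for
illustration.

```shell
#!/bin/sh
# Toolstack binary; overridable so the logic can be dry-run without Xen
# (the XL variable is an assumption of this sketch, not an xl feature).
XL="${XL:-xl}"

# check_suspend_resume DOMAIN CHECKPOINT_FILE  (hypothetical helper)
# Runs the suspend half (save) and the resume half (restore) one after
# the other and reports which half failed, if any.
check_suspend_resume() {
    dom="$1"
    file="$2"
    # "xl save" suspends the domain and writes its state to the file.
    "$XL" save "$dom" "$file" || { echo "suspend (save) failed"; return 1; }
    # "xl restore" recreates the domain from the saved state.
    "$XL" restore "$file" || { echo "resume (restore) failed"; return 2; }
    echo "suspend and resume both succeeded"
}
```

For example: check_suspend_resume mydomU /var/tmp/mydomU.chk (the path
is a placeholder). A failure in the first step points at the suspend
path in the domU, a failure in the second at resume.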
I have never got migration to succeed, nor suspend/resume (which is
fundamentally the same operation). Sometimes suspend appears to work,
but resume fails.
The usual symptoms for me are that the domain just continues
regardless (i.e. it ignores the suspend request) rather than it locks.
However, from a dom0 point of view, it waits for confirmation of
suspension and so blocks any further operations on it.
I understand that the problem is reasonably well understood by the
relevant developers, but that debugging is hard.
Now that's a regression: suspend is supposed to finish correctly (i.e.
the domain is descheduled and you get a core file that represents the VM's
state). Resuming is the culprit, and I have failed to get $SPARETIME to
investigate the situation in the last few months; the usual symptom is a
corrupted Xen I/O ring that completely thrashes xbdback(4) and leads to
a panicked domU (or a very, very slow dom0, like an interrupt storm).
Did you try suspend/resume on a UP or MP domU? Using xl or xm?