Port-xen archive


[suspend/resume] memory_op hypercall failure: XENMEM_maximum_gpfn



Hi!

This mail originates from the latest dom0/domU issue fixed by Manuel, related to cpu_idle() being set to null (which caused an early panic).

I wanted to try the "xm dump-core" functionality provided by Xen. Basically, this option pauses a domain, takes a snapshot of the kernel and memory state, and unpauses it. The resulting image is stored in a file, like a traditional core dump.

Things are not that easy though. Calls to xm dump-core fail with:

# xm dump-core 2 /root/core
Error: Failed to dump core: (1, 'Internal error', 'p2m_size < nr_pages -1 (0 < 1fff')

I tracked down the error. The p2m_size value (proportional to the amount of memory allocated to a guest) is not actually reported as 0, despite what the error message says. The privcmd ioctl does perform the hypercall to obtain this information, but the hypercall returns -1. The code used to dump the guest, in the xentools3 package (tools/libxc/xc_core_x86.c), just increments the return value by 1, without checking for errors:

static int nr_gpfns(int xc_handle, domid_t domid) {
   return xc_memory_op(xc_handle, XENMEM_maximum_gpfn, &domid) + 1;
}

This turns the -1 error return into a p2m table size of 0.
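For illustration, here is a minimal sketch of what a checked variant could look like. It is not the libxc code: fake_xc_memory_op() and fake_memory_op_result are hypothetical stand-ins for the real xc_memory_op() call so that the example compiles without libxc, and nr_gpfns_checked() is a name made up for this sketch. The point is only that a negative return is reported as an error instead of being incremented into 0.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for xc_memory_op(xc_handle, XENMEM_maximum_gpfn, &domid).
 * It returns a value set by the caller so the sketch builds without libxc.
 */
static long fake_memory_op_result = -1;

static long fake_xc_memory_op(void)
{
    return fake_memory_op_result;
}

/*
 * Store the number of guest page frames in *nr and return 0.
 * On hypercall failure, return -1 and leave *nr untouched, so that
 * an error of -1 is never turned into a size of 0.
 */
static int nr_gpfns_checked(unsigned long *nr)
{
    long max_gpfn;

    assert(nr != NULL);

    max_gpfn = fake_xc_memory_op();
    if (max_gpfn < 0) {
        fprintf(stderr, "XENMEM_maximum_gpfn failed: %ld\n", max_gpfn);
        return -1;
    }

    *nr = (unsigned long)max_gpfn + 1;
    return 0;
}
```

With such a check the caller would see the hypercall failure directly, rather than the misleading "p2m_size < nr_pages" message.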

Observations are the same in privcmd.c (xen/xen/privcmd.c): the hypercall does return -1.

Questions are:
- The hypercall APIs speak of a "-ve errcode" on failure, but I cannot find which error codes they are referring to. Are they the same as the ones given in mini-os from xentools (extras/mini-os/include/errno-base.h)? If so, -1 indicates an EPERM error, which is weird for dom0.
- Does the XENMEM_maximum_gpfn memory operation require some cooperation from the guest to obtain the proper value? I would say no; the hypervisor should be able to do it by itself. Am I missing something here?

While inspecting the privcmd interface, I found something else that looks weird: xend periodically polls the hypervisor with a XEN_DOMCTL_getdomaininfo operation for a domain ID of value "latest_domain_id_started + 1". Of course, it returns an error, -3 (ESRCH, no such process, according to the mini-os errno files). Is this normal behavior for xend? Is it some kind of regular polling to check that no unmonitored domain has been created without xend supervision?
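To make the -1 and -3 values above easier to read in debug output, a small helper along these lines could be dropped next to the printf() calls. This is only a sketch under an assumption that is exactly the open question above: that the negative values follow the usual errno numbering (EPERM is 1, ESRCH is 3), taken here from the host's <errno.h>. The names neg_errcode_name() and report_ret() are made up for this example.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Map a "-ve errcode" style return value to a symbolic name.
 * ASSUMPTION: the codes use the host's errno numbering, which may or
 * may not match what the hypervisor actually returns.
 */
static const char *neg_errcode_name(long ret)
{
    if (ret >= 0)
        return "success";

    switch (-ret) {
    case EPERM:  return "EPERM";
    case ESRCH:  return "ESRCH";
    case ENOMEM: return "ENOMEM";
    case EINVAL: return "EINVAL";
    default:     return "unknown";
    }
}

/* Print one debug line for the return value of an operation named 'what'. */
static void report_ret(const char *what, long ret)
{
    if (ret < 0)
        fprintf(stderr, "%s returned %ld (%s: %s)\n",
                what, ret, neg_errcode_name(ret), strerror((int)-ret));
    else
        fprintf(stderr, "%s returned %ld\n", what, ret);
}
```

For example, report_ret("XENMEM_maximum_gpfn", -1) would label the value as EPERM under this assumption. If the -1 instead comes from an ioctl-style convention where the real code is stored elsewhere, the label would be misleading, so it should be read as a hint only.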

Lastly, all my debugging is done through a couple of printf()'s. It works, but it is rather impractical. Any advice on a more efficient method would be gladly appreciated.

Note that save/restore operations dump the state of the kernel to a file too, so XENMEM_maximum_gpfn is necessary for suspend.

Thanks for your attention :)

Cheers,

--
Jean-Yves Migeon
jean-yves.migeon%espci.fr@localhost



