Port-xen archive


Re: Dom0 ballooning: crash on guest start

On 01.05.2011 04:19, Jean-Yves Migeon wrote:
> On 29.04.2011 15:02, Christoph Egger wrote:
>>>> balloon0: inflate 512 => inflated by 512
>>>> uvm_fault(0xffffffff80c6b220, 0xffffffff81400000, 1) -> e
>>>> fatal page fault in supervisor mode
>>>> trap type 6 code 0 rip ffffffff8054593d cs e030 rflags 10216 cr2
>>> Okay, happens when balloon(4) has already been inflated by a good share.
>>> Out of interest, having XENDEBUG_SYNC enabled in
>>> arch/xen/x86/x86_xpmap.c does not change anything to the result?
> I can reliably trigger the issue. It's due to the domain doing P2M page
> frame number translations via the xpmap_phys_to_machine_mapping array
> for specific PFNs in the 3GiB range.
> For amd64, index are from 0xbd000 to 0xbd200 (one page, mapping physical
> pages 3024 => 3026MiB).
> Now, why there is this 2MiB hole right there in the pseudo-physical
> map... is the next question. They have no direct connection to machine
> (real) addresses, and there's nothing like this appearing in the
> dom0/hypervisor domain_build code (xen/arch/x86/domain_build.c).

I am more and more convinced that the issue lies in the early stage of
boot: xpmap_phys_to_machine_mapping (the "P2M array") is first populated
by the hypervisor when launching dom0, then used and updated by the
domain as necessary, without requiring the hypervisor's help.

When tracking the content of the P2M array during start, the entries at
the aforementioned addresses (0xffffffff81400000 here) have correct
values up to entering init_x86_64(). Upon leaving
pmap_prealloc_lowmem_ptps(), their content starts looking suspicious
(entries are 0 and not ~0 like INVALID_P2M_ENTRY typically is). Then,
right after the pmap_growkernel() call, reading the address triggers
unrecoverable page faults.
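
For anyone who wants to repeat this kind of tracking, the check can be
sketched roughly as below. This is only an illustration, not the code I
used: the ex_* names are made up, the array is a plain uint64_t buffer
standing in for the real P2M array, and the invalid marker is assumed to
be ~0 as mentioned above. Treating 0 as "suspicious" is a heuristic too,
since a machine frame number of 0 is not strictly impossible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the invalid marker is all ones, as ~0 suggests. */
#define EX_INVALID_P2M_ENTRY (~UINT64_C(0))

/* Hypothetical classification of a single P2M entry. */
enum ex_p2m_state { EX_P2M_MAPPED, EX_P2M_INVALID, EX_P2M_ZERO };

enum ex_p2m_state
ex_p2m_classify(uint64_t entry)
{
	if (entry == EX_INVALID_P2M_ENTRY)
		return EX_P2M_INVALID;	/* explicitly marked as unmapped */
	if (entry == 0)
		return EX_P2M_ZERO;	/* suspicious: looks wiped */
	return EX_P2M_MAPPED;		/* anything else is taken as an MFN */
}

/*
 * Return the first index in [first, last) whose entry is zero,
 * or last when no such entry exists.
 */
size_t
ex_p2m_first_zero(const uint64_t *p2m, size_t first, size_t last)
{
	assert(p2m != NULL && first <= last);

	for (size_t pfn = first; pfn < last; pfn++) {
		if (ex_p2m_classify(p2m[pfn]) == EX_P2M_ZERO)
			return pfn;
	}
	return last;
}
```

Calling something like ex_p2m_first_zero() over the same index range at
several points of the boot path is one way to narrow down which step
clobbers the entries.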

The hole is just one page (4KiB) of the P2M array, which is what maps
the 2MiB of pseudo-physical memory quoted above. "Manually" forcing the
code to skip over it makes balloon(4) happy again.
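
The arithmetic linking the PFN indexes, the 3024 => 3026MiB range and
the single page of the P2M array can be sketched as follows. The helper
names and EX_* constants are hypothetical, and the sketch assumes amd64
with 4KiB pages and 8-byte P2M entries:

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration (amd64): 4KiB pages, 8-byte entries. */
#define EX_PAGE_SHIFT		12
#define EX_PAGE_SIZE		(UINT64_C(1) << EX_PAGE_SHIFT)
#define EX_P2M_ENTRY_SIZE	UINT64_C(8)
#define EX_P2M_PER_PAGE		(EX_PAGE_SIZE / EX_P2M_ENTRY_SIZE)

/* Pseudo-physical address of a PFN, expressed in MiB. */
uint64_t
ex_pfn_to_mib(uint64_t pfn)
{
	return (pfn << EX_PAGE_SHIFT) >> 20;
}

/* Index of the P2M array page that holds the entry for this PFN. */
uint64_t
ex_p2m_page_of_pfn(uint64_t pfn)
{
	return (pfn * EX_P2M_ENTRY_SIZE) >> EX_PAGE_SHIFT;
}

/* Number of P2M array pages backing the PFN range [first, last). */
uint64_t
ex_p2m_pages_for_range(uint64_t first, uint64_t last)
{
	assert(first < last);
	return ex_p2m_page_of_pfn(last - 1) - ex_p2m_page_of_pfn(first) + 1;
}

/* Pseudo-physical memory (in MiB) described by one P2M array page. */
uint64_t
ex_p2m_page_coverage_mib(void)
{
	return (EX_P2M_PER_PAGE << EX_PAGE_SHIFT) >> 20;
}
```

Under these assumptions, PFNs 0xbd000 to 0xbd200 fall into a single page
of the P2M array, and that page describes 2MiB of pseudo-physical
memory starting at 3024MiB.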

IMHO, the black magic (nefarious? :o) code that handles 1:1 memory
mappings and kernel relocation at this early stage probably has a bug
hiding somewhere and incorrectly maps certain
xpmap_phys_to_machine_mapping pages, ultimately leading to a fault.
The error has probably been in there for a very long time, as you
can only trigger it once you start touching "high" PFNs (above 3GiB).
As it is quite uncommon to have 3GiB+ allocations in a domain, only
ballooning by a good share is likely to trigger it. I have not managed
to reproduce this under i386 PAE.

FWIW, I can also trigger the fault with very large allocations that
force zeroing pages, dd(1) being the de facto standard for this:

# after a while, a panic() happens at a fault address covered by
# xpmap_phys_to_machine_mapping
dd if=/dev/zero of=/dev/null bs=4g count=1

I'll have to put this study on hold though: jym-xensuspend really needs
urgent fixes now.

Jean-Yves Migeon
