Port-xen archive


Re: Dom0 PAE panic when starting xend



On Wednesday 18 February 2009 14:57:49 Christoph Egger wrote:
> Hi,
>
>
> I can boot i386 Dom0 PAE with Xen 3.3.1.
> When I launch xend, I get a panic:
>
> # xend start
> Feb 18 13:22:59 fricka xenstored: Checking store ...
> Feb 18 13:23:02 fricka xenstored: Checking store complete.
> (XEN) mm.c:1777:d0 Error pfn 55555: rd=ff2b8100, od=00000000, caf=00000000,
> taf=00000000
> (XEN) mm.c:708:d0 Error getting mfn 55555 (pfn 55555555) from L1 entry
> 0000000055555067 for dom0
> xpq_flush_queue: 1 entries
> 0x0000000102fd1608: 0x0000000055555067
> panic: HYPERVISOR_mmu_update failed
>
> fatal breakpoint trap in supervisor mode
> trap type 1 code 0 eip c02125c4 cs 9 eflags 246 cr2 bb6c1800 ilevel 6
> Stopped in pid 415.1 (xenstored) at     netbsd:breakpoint+0x4:  popl   %ebp
> db> bt
> breakpoint(c09987fe,cdfe9988,c09b7080,c05fce0b,c099f49d,5,0,0,cdfe9998,ffffffea)
> at netbsd:breakpoint+0x4
> panic(c099f4b3,2fd1608,1,55555067,0,cdfe99ac,0,c07619c6,cabfd484,c0a9105c)
> at netbsd:panic+0x1a4
> xpq_update_foreign(2fd1608,1,55555067,0,cdb6bce0,0,cdfe9a1c,c05f5b0c,cdb48910,cdb6bce0)
> at netbsd:xpq_update_foreign
> pmap_enter_ma(cda80934,bb6c1000,55555000,0,55555000,0,3,23,7ff0,3) at
> netbsd:pmap_enter_ma+0x580
> pmap_enter(cda80934,bb6c1000,55555000,0,3,23,ce0f7f50,0,c16fa640,bb6c1000)
> at netbsd:pmap_enter+0xd3
> udv_fault(cdfe9c70,bb6c1000,cdfe9c30,1,0,1,5,ffffffff,c06fb56b,0) at
> netbsd:udv_fault+0x491
> uvm_fault_internal(ca89f4e0,bb6c1000,1,0,0,0,0,cdf9aa20,0,c09f7ff0) at
> netbsd:uvm_fault_internal+0x8e5
> trap() at netbsd:trap+0x6e0
> --- trap (number 6) ---
> 0x804d53a:
> db>
>
>
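The L1 entry in the log above, 0000000055555067, splits into a frame number and flag bits according to the x86 PAE page-table entry layout. The following is only a minimal sketch of that split: the struct and function names are hypothetical (they are not taken from Xen or NetBSD), and the frame mask assumes the architectural 52-bit physical address limit.

```c
#include <assert.h>
#include <stdint.h>

/* x86 page-table entry flag bits (low 12 bits of the entry). */
#define PTE_P  (1u << 0)   /* present */
#define PTE_RW (1u << 1)   /* writable */
#define PTE_US (1u << 2)   /* user-accessible */
#define PTE_A  (1u << 5)   /* accessed */
#define PTE_D  (1u << 6)   /* dirty */

#define PAGE_SHIFT     12
#define PTE_FLAGS_MASK 0x0000000000000fffULL
/* ASSUMPTION: bits 12..51 hold the frame number (52-bit physical limit). */
#define PTE_FRAME_MASK 0x000ffffffffff000ULL

static_assert((PTE_FLAGS_MASK & PTE_FRAME_MASK) == 0,
              "flag and frame fields must not overlap");

/* Hypothetical helper type for illustration only. */
struct pte_fields {
    uint64_t frame;  /* frame number the entry points at */
    uint32_t flags;  /* low 12 flag bits */
};

/* Split a 64-bit PAE L1 entry into its frame number and flag bits. */
static inline struct pte_fields decode_l1e(uint64_t l1e)
{
    struct pte_fields f;
    f.frame = (l1e & PTE_FRAME_MASK) >> PAGE_SHIFT;
    f.flags = (uint32_t)(l1e & PTE_FLAGS_MASK);
    return f;
}
```

For the logged value, decode_l1e(0x55555067ULL) gives frame 0x55555, which matches the "mfn 55555" in the Xen message, and flags 0x067, i.e. present, writable, user, accessed and dirty.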
> In Xen, the function containing mm.c:1777 is this:
>
> int get_page(struct page_info *page, struct domain *domain)
> {
>     u32 x, nx, y = page->count_info;
>     u32 d, nd = page->u.inuse._domain;
>     u32 _domain = pickle_domptr(domain);
>
>     do {
>         x  = y;
>         nx = x + 1;
>         d  = nd;
>         if ( unlikely((x & PGC_count_mask) == 0) ||  /* Not allocated? */
>              unlikely((nx & PGC_count_mask) == 0) || /* Count overflow? */
>              unlikely(d != _domain) )                /* Wrong owner? */
>         {
>             if ( !_shadow_mode_refcounts(domain) && !domain->is_dying )
>                 gdprintk(XENLOG_INFO,
>                          "Error pfn %lx: rd=%p, od=%p, caf=%08x, taf=%"
>                          PRtype_info "\n",
>                          page_to_mfn(page), domain, unpickle_domptr(d),
>                          x, page->u.inuse.type_info);
>             return 0;
>         }
>         asm volatile (
>             LOCK_PREFIX "cmpxchg8b %2"
>             : "=d" (nd), "=a" (y),
>               "=m" (*(volatile u64 *)(&page->count_info))
>             : "0" (d), "1" (x), "c" (d), "b" (nx) );
>     }
>     while ( unlikely(nd != d) || unlikely(y != x) );
>
>     return 1;
> }
>
> I added additional debug output to see why get_page()
> returns 0:
>
> (XEN) get_page: (x & PGC_count_mask) = 0
> (XEN) get_page: (nx & PGC_count_mask) = 1
> (XEN) get_page: wrong owner
>
> So the accessed page a) is allocated, b) overflows and c) doesn't belong to
> Dom0.
>
> I added a BUG(); right before 'return 0;' to get a backtrace:
>
> (XEN) Xen call trace:
> (XEN)    [<ff13d169>] get_page+0x11e/0x15a
> (XEN)    [<ff13b5d2>] get_page_from_l1e+0x284/0x43f
> (XEN)    [<ff13c98a>] mod_l1_entry+0x3c5/0x4a3
> (XEN)    [<ff13eb74>] do_mmu_update+0x44d/0x76a
> (XEN)    [<ff1a58a8>] hypercall+0xb8/0xd8
>
> I'm not sure whether I hit a bug in Xen or in NetBSD/Xen.
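
The three failure checks in the get_page() loop can be replayed in user space with the values from the log line (caf=00000000, od=00000000, rd=ff2b8100). This is only a rough sketch under stated assumptions: the width of the count mask is illustrative and should be taken from the headers of the Xen tree in use, the raw rd/od values stand in for the pickled domain pointers, and the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: illustrative width of the reference-count field; the real
 * PGC_count_mask comes from the Xen headers. */
#define PGC_COUNT_MASK ((1u << 26) - 1)

static_assert((PGC_COUNT_MASK & (PGC_COUNT_MASK + 1u)) == 0,
              "count mask must be a run of contiguous low bits");

/* Hypothetical bit flags naming each reason get_page() can return 0. */
#define FAIL_NOT_ALLOCATED  (1u << 0)  /* (x  & mask) == 0 */
#define FAIL_COUNT_OVERFLOW (1u << 1)  /* (nx & mask) == 0 */
#define FAIL_WRONG_OWNER    (1u << 2)  /* d != _domain */

/* Evaluate the same three conditions as the quoted loop and report which
 * of them are true. Returns 0 when get_page() would go on to the cmpxchg. */
static inline unsigned get_page_failures(uint32_t count_info,
                                         uint32_t owner,
                                         uint32_t requester)
{
    uint32_t x = count_info;
    uint32_t nx = x + 1;
    unsigned why = 0;

    if ((x & PGC_COUNT_MASK) == 0)
        why |= FAIL_NOT_ALLOCATED;
    if ((nx & PGC_COUNT_MASK) == 0)
        why |= FAIL_COUNT_OVERFLOW;
    if (owner != requester)
        why |= FAIL_WRONG_OWNER;
    return why;
}
```

With the logged values, get_page_failures(0x00000000, 0x00000000, 0xff2b8100) reports the zero-count check and the owner check, but not the overflow check. That holds only if caf is the x tested in the loop, so it is worth comparing against the debug output above.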


I figured out that it is xenstored that triggers the issue:
starting xenstored manually reproduces the panic.
Looking into the source, I found a bunch of undocumented
options. Running xenstored -D skips some domain initialization
code, and with it the panic does NOT occur. Interesting...

Christoph

