Port-xen archive


Re: Xen 3.3: Problem HVM guest



On Thursday 14 August 2008 23:39:23 Christoph Egger wrote:
> Manuel Bouyer wrote:
> > On Thu, Aug 14, 2008 at 08:25:14PM +0200, Christoph Egger wrote:
> >>> Not really, as the write which is failing is also in dom0 (so on the
> >>> same CPU). I think the tlb should be properly invalidated. Just to make
> >>> sure you can try adding
> >>> pmap_tlb_shootdown(pmap, va, 0, opte);
> >>> just after xpq_update_foreign() in pmap_enter_ma(). But as we're
> >>> switching pmaps on return to userland, this shouldn't be needed.
> >>
> >> This has no impact.
> >
> > As expected ... I'm running out of ideas. I'll try to reproduce this
> > on my test box, but it won't be before next week.
>
> I found the bug:
>  >>>>> - instrument privpgop_fault() to see if it gets called at all for
>  >>>>>   this mapping, and if it's doing the right thing.
>  >>>>>   There should be only one page in this object, and the machine
>  >>>>>   address should be 0 (pobj->maddr[maddr_i])
>  >>>>
>  >>>> Yes, privpgop_fault() is called. It looks like it's called in a
>  >>>> loop. npages = 1 and machine address is 0.
>  >>>
>  >>> OK, it has the right data. I guess it's called in a loop because
>  >>> writing at the page keeps failing.
>
> Writing at the page keeps failing because privpgop_fault()
> skips any page whose machine address is 0:
>
>           if (pobj->maddr[maddr_i] == 0)
>                continue; /* this has already been flagged as error */
>
> Removing this check makes privpgop_fault() call pmap_enter_ma(),
> after which the write access finally succeeds and the HVM guest
> starts.
>
> May I commit this change?
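The fault loop described above can be modelled in a few lines. This is only a toy sketch, not the NetBSD code: `toy_obj`, `toy_fault()` and `toy_access()` are hypothetical names, and setting `mapped[]` merely stands in for what pmap_enter_ma() would do. It assumes, as reported in the thread, an object with one page whose machine address is 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the privcmd object; only maddr[] echoes the thread. */
struct toy_obj {
    size_t   npages;
    uint64_t maddr[4];   /* machine address per page */
    bool     mapped[4];  /* set once a mapping has been entered */
};

/* One simulated fault: enter a mapping for every page of the object. */
static void toy_fault(struct toy_obj *o, bool skip_zero)
{
    for (size_t i = 0; i < o->npages; i++) {
        if (skip_zero && o->maddr[i] == 0)
            continue;          /* the check under discussion */
        o->mapped[i] = true;   /* stands in for pmap_enter_ma() */
    }
}

/*
 * Retry an access to `page` until it is mapped.
 * Returns the number of faults taken, or -1 if the page is still
 * unmapped after max_faults attempts (the "called in a loop" case).
 */
static int toy_access(struct toy_obj *o, size_t page, bool skip_zero,
                      int max_faults)
{
    int faults = 0;

    assert(page < o->npages);
    while (!o->mapped[page]) {
        if (faults == max_faults)
            return -1;
        toy_fault(o, skip_zero);
        faults++;
    }
    return faults;
}
```

With the check enabled, a page at machine address 0 is never mapped, so every retry faults again; with the check removed, the first fault installs the mapping and the access completes.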


The story is not over yet. When running an HVM guest, the machine
suddenly freezes with this message:

Mutex error: mutex_spin_retry: locking against myself

lock address : 0xffffffff80b86a80
current cpu  :                  0
current lwp  : 0xffffa000257e47e0
owner field  : 0x0000000000010700 wait/spin:                0/1

The machine freezes completely: keyboard interrupts, the serial console
and the network all stop working, and the machine can't be pinged from
outside.
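For readers unfamiliar with the message: "locking against myself" generally means that a non-recursive lock was requested by the context that already holds it, so nobody is left to release it. The sketch below is a toy illustration of that condition only, not the NetBSD mutex implementation; `toy_spin_mutex`, `toy_spin_enter()` and `toy_spin_exit()` are hypothetical names, and it says nothing about which code path triggers the problem here.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_result {
    TOY_ACQUIRED,       /* lock was free and is now held by the caller */
    TOY_WOULD_SPIN,     /* held by another CPU: spinning can succeed */
    TOY_SELF_DEADLOCK   /* held by the caller: spinning can never succeed */
};

/* Hypothetical non-recursive spin lock that remembers its owner. */
struct toy_spin_mutex {
    bool held;
    int  owner_cpu;
};

/* Try to take the lock once and classify the outcome. */
static enum toy_result toy_spin_enter(struct toy_spin_mutex *m, int cpu)
{
    if (!m->held) {
        m->held = true;
        m->owner_cpu = cpu;
        return TOY_ACQUIRED;
    }
    if (m->owner_cpu == cpu)
        return TOY_SELF_DEADLOCK;   /* "locking against myself" */
    return TOY_WOULD_SPIN;
}

/* Release the lock; only the owner may do so. */
static void toy_spin_exit(struct toy_spin_mutex *m, int cpu)
{
    assert(m->held && m->owner_cpu == cpu);
    m->held = false;
}
```

A second `toy_spin_enter()` from the same CPU without an intervening `toy_spin_exit()` yields `TOY_SELF_DEADLOCK`. Common ways to get there in general are an interrupt handler taking a lock that the interrupted code already holds, or a call path that re-enters a locked region; whether either applies to this freeze is an open question.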


What I figured out so far:

a) I can only reproduce this with / on NFS. (So is this specific to NetBSD/Xen?)
b) The values in the error message are always the same.

Can anyone help me track this down, please?


Christoph

