Re: Xen 3.3: Problem HVM guest

To: Christoph Egger <Christoph_Egger%gmx.de@localhost>
Subject: Re: Xen 3.3: Problem HVM guest
From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
Date: Thu, 14 Aug 2008 18:34:42 +0200

On Thu, Aug 14, 2008 at 06:27:05PM +0200, Christoph Egger wrote:
> On Thursday 14 August 2008 16:37:53 Manuel Bouyer wrote:
> > On Thu, Aug 14, 2008 at 04:28:03PM +0200, Christoph Egger wrote:
> > > On Thursday 14 August 2008 15:52:24 Manuel Bouyer wrote:
> > > > On Thu, Aug 14, 2008 at 01:59:35PM +0200, Christoph Egger wrote:
> > > > > > Did you try ktracing the process when it starts looping ?
> > > > >
> > > > >    400      8 python2.4 CALL  mlock(0x7f7ff73fb000,0x1000)
> > > > >    400      8 python2.4 RET   mlock 0
> > > > >    400      8 python2.4 CALL  ioctl(8,_IOWR('P',0,0x38),0x7f7ff73fb760)
> > > > >    400      8 python2.4 GIO   fd 8 wrote 56 bytes
> > > > >        "\"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\240\M-7?\M-w\^?\^?\0\0\M^?
> > > > > \M-g\^C\0\0\0\0\0\^A\0\0\
> > > > >         \0\0\0\0\0001S\240\M-{\^?\^?\0\0\^A\0\0\0\^A\0\0\0"
> > > > >    400      8 python2.4 GIO   fd 8 read 56 bytes
> > > > >        "\"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\240\M-7?\M-w\^?\^?\0\0\M^?
> > > > > \M-g\^C\0\0\0\0\0\^A\0\0\
> > > > >         \0\0\0\0\0001S\240\M-{\^?\^?\0\0\0\0\0\0\0\0\0\0"
> > > > >    400      8 python2.4 RET   ioctl 0
> > > > >    400      8 python2.4 CALL  munlock(0x7f7ff73fb000,0x1000)
> > > > >    400      8 python2.4 RET   munlock 0
> > > > >    400      8 python2.4 CALL  mmap(0,0x1000,3,0x1001,0xffffffff,0,0)
> > > > >    400      8 python2.4 RET   mmap 140187698843648/0x7f7ffdfdd000
> > > > >    400      8 python2.4 CALL 
> > > > > ioctl(8,_IOW('P',0x2,0x10),0x7f7ff73fb790) 400      8 python2.4 GIO  
> > > > > fd 8 wrote 16 bytes
> > > > >        "\^A\0\0\0\^A\0\0\0p\M-7?\M-w\^?\^?\0\0"
> > > > >    400      8 python2.4 RET   ioctl 0
> > > > >    400      3 python2.4 RET   select 0
> > > > >    400      3 python2.4 CALL
> > > > > _lwp_park(0,0,0x7f7ffd104e80,0x7f7ffd104e80)
> > > > >
> > > > > > > I guess it's looping on page fault. To check for what's going on I
> > > > > > > would:
> > > > > > > - check that privcmd_map_obj() gets the protection right for
> > > > > > >   this address (it should be VM_PROT_READ | VM_PROT_WRITE, as we're
> > > > > > >   remapping a range which was mmapped read/write)
> > > > >
> > > > > Yes, protection is VM_PROT_READ | VM_PROT_WRITE.
> > > >
> > > > Looks good
> > > >
> > > > > > - instrument privpgop_fault() to see if it gets called at all for
> > > > > > this mapping, and if it's doing the right thing.
> > > > > >   There should be only one page in this object, and the machine
> > > > > > address should be 0 (pobj->maddr[maddr_i])
> > > > >
> > > > > Yes, privpgop_fault() is called. It looks like it's called in a loop.
> > > > > npages = 1 and machine address is 0.
> > > >
> > > > OK, it has the right data. I guess it's called in a loop because
> > > > writing to the page keeps failing.
> > > >
> > > > > > - if privpgop_fault() behaves properly, check that the
> > > > > > xpq_update_foreign() call in pmap_enter_ma() works as intended.
> > > > >
> > > > > It is not failing at least.
> > > >
> > > > Could you check the value of the pte (*ptep) after the call to
> > > > xpq_update_foreign() ?
> > > > Also, I guess we should be watching the value of 'ok' in
> > > > xpq_update_foreign(), which we're not doing right now ...
> > >
> > > [...]
> > > xpq_update_foreign: ok: 1
> > > pmap_enter_ma: *ptep: 0xda8b9167
> > > xpq_update_foreign: ok: 1
> > > pmap_enter_ma: *ptep: 0xdee10167
> > > xpq_update_foreign: ok: 1
> > > pmap_enter_ma: *ptep: 0x9c1fb167
> > > xpq_update_foreign: ok: 1
> > > pmap_enter_ma: *ptep: 0x9c1fc167
> > > xpq_update_foreign: ok: 1
> > > pmap_enter_ma: *ptep: 0x9c1fd167
> > > xpq_update_foreign: ok: 1
> > > pmap_enter_ma: *ptep: 0x9c1fe167
> > > xpq_update_foreign: ok: 1
> > > pmap_enter_ma: *ptep: 0x9c1ff167
> > > xpq_update_foreign: ok: 1
> > > pmap_enter_ma: *ptep: 0x9c1ff167
> > > "hang"
> >
> > Also, did you print *ptep as %x or %lx ? It should be a 64bit value on
> > amd64 ...
> 
> %lx
> 
> > But anyway it looks like a valid PTE entry. I wonder if we're updating
> > the right PTE. Could you also check if the virtual address
> > passed to pmap_enter_ma() is the same as the pointer used in the hanging
> > program ? Also check the domid, while at it ...
> 
> Virtual address used in the hanging program: page0: 0x7f7ffdfdd000
> 
> xpq_update_foreign: ok: 1
> pmap_enter_ma: va: 0x7f7ffdfdd000, *ptep: 0x9c1ff167, domid: 1
> 
> va matches and domid is also right.

OK; just to make sure: the program hangs while trying to write to page0,
right (i.e. the instruction after the write is never executed)?
Also, could you print *ptep as a full 64-bit value?
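
For reference, here is a minimal sketch of how the PTE could be printed with all 64 bits and its low flag bits decoded. It is plain user-space C rather than the actual pmap code, and the helper names (pte_frame, pte_user_writable, pte_format) are hypothetical. The bit positions are the architectural x86-64 ones for a 4 KB page; a Xen PV kernel may treat some of them differently, so take the decoded letters as a reading aid only.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural x86-64 PTE bits for a 4 KB page. */
#define PTE_P     (UINT64_C(1) << 0)   /* present */
#define PTE_RW    (UINT64_C(1) << 1)   /* writable */
#define PTE_US    (UINT64_C(1) << 2)   /* user accessible */
#define PTE_A     (UINT64_C(1) << 5)   /* accessed */
#define PTE_D     (UINT64_C(1) << 6)   /* dirty */
#define PTE_G     (UINT64_C(1) << 8)   /* global */
#define PTE_NX    (UINT64_C(1) << 63)  /* no-execute */
#define PTE_FRAME UINT64_C(0x000ffffffffff000)  /* address bits 12..51 */

/* Hypothetical helper: frame address stored in the PTE. */
static uint64_t
pte_frame(uint64_t pte)
{
	return pte & PTE_FRAME;
}

/*
 * Hypothetical helper: nonzero if this PTE, taken alone, allows a
 * user-mode write (present, writable and user bits all set).
 */
static int
pte_user_writable(uint64_t pte)
{
	const uint64_t need = PTE_P | PTE_RW | PTE_US;

	return (pte & need) == need;
}

/*
 * Hypothetical helper: format the PTE as 16 hex digits, followed by
 * the frame address and one letter per flag ('-' when clear).
 * PRIx64 prints all 64 bits regardless of the width of long.
 * Returns what snprintf() returns.
 */
static int
pte_format(uint64_t pte, char *buf, size_t len)
{
	assert(buf != NULL);
	return snprintf(buf, len,
	    "0x%016" PRIx64 " frame=0x%" PRIx64 " %c%c%c%c%c%c%c",
	    pte, pte_frame(pte),
	    (pte & PTE_P)  ? 'P' : '-',
	    (pte & PTE_RW) ? 'W' : '-',
	    (pte & PTE_US) ? 'U' : '-',
	    (pte & PTE_A)  ? 'A' : '-',
	    (pte & PTE_D)  ? 'D' : '-',
	    (pte & PTE_G)  ? 'G' : '-',
	    (pte & PTE_NX) ? 'N' : '-');
}
```

Applied to the last value you posted, 0x9c1ff167, the low bits 0x167 decode as present, writable, user, accessed, dirty and global, with frame 0x9c1ff000. Whether bit 63 or any address bits above bit 31 are set is exactly what the full 64-bit print would show.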

-- 
Manuel Bouyer, LIP6, Universite Paris VI.           
Manuel.Bouyer%lip6.fr@localhost
     NetBSD: 26 years of experience will always make the difference
--
