tech-kern archive


Re: Xen 3.3: Problem HVM guest



On Wednesday 13 August 2008 13:49:26 Manuel Bouyer wrote:
> On Wed, Aug 13, 2008 at 09:57:58AM +0200, Christoph Egger wrote:
> > On Tuesday 12 August 2008 23:54:20 Christoph Egger wrote:
> > > Christoph Egger wrote:
> > > > Hi,
> > > >
> > > > When launching an HVM guest, the process hangs and does not even block
> > > > any xmlrpc communication. There is a significant system slowdown until
> > > > reboot.
> > > >
> > > > I finally identified the place of the hang. It is in the xentools33
> > > > package in ${WRKSRC}/libxc/xc_hvm_build.c , function setup_guest().
> > > >
> > > > At the end of setup_guest(), there's this chunk of code:
> > > >
> > > >
> > > >
> > > >     /* Insert JMP <rel32> instruction at address 0x0 to reach entry point. */
> > > >     entry_eip = elf_uval(&elf, elf.ehdr, e_entry);
> > > >     if ( entry_eip != 0 )
> > > >     {
> > > >         char *page0 = xc_map_foreign_range(
> > > >             xc_handle, dom, PAGE_SIZE, PROT_READ | PROT_WRITE, 0);
> > > >         if ( page0 == NULL )
> > > >             goto error_out;
> > > >         page0[0] = 0xe9;               <------------ "hang"
> > > >         *(uint32_t *)&page0[1] = entry_eip - 5;
> > > >         munmap(page0, PAGE_SIZE);
> > > >     }
> > > >
> > > >
> > > > The "hang" happens when executing page0[0] = 0xe9;
> > > >
> > > > I'm CCing this to tech-kern, because I'm not sure whether this is a
> > > > bug in xentools or whether I found a UVM/pmap bug.
> > >
> > > Here is a URL to the diff where this hunk was added, including the
> > > commit log:
> > >
> > > http://xenbits.xensource.com/xen-unstable.hg/rev/772674585a1a
> >
> > Undoing this c/s 15985 lets the HVM guest launch, but I don't know if
> > this is the right way.
> >
> > Manuel: Can you help me verify whether IOCTL_PRIVCMD_MMAP in
> > sys/arch/xen/xen/privcmd.c is implemented correctly?
>
> Did you try ktracing the process when it starts looping?

   400      8 python2.4 CALL  mlock(0x7f7ff73fb000,0x1000)
   400      8 python2.4 RET   mlock 0
   400      8 python2.4 CALL  ioctl(8,_IOWR('P',0,0x38),0x7f7ff73fb760)
   400      8 python2.4 GIO   fd 8 wrote 56 bytes
       "\"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\240\M-7?\M-w\^?\^?\0\0\M^?
\M-g\^C\0\0\0\0\0\^A\0\0\
        \0\0\0\0\0001S\240\M-{\^?\^?\0\0\^A\0\0\0\^A\0\0\0"
   400      8 python2.4 GIO   fd 8 read 56 bytes
       "\"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\240\M-7?\M-w\^?\^?\0\0\M^?
\M-g\^C\0\0\0\0\0\^A\0\0\
        \0\0\0\0\0001S\240\M-{\^?\^?\0\0\0\0\0\0\0\0\0\0"
   400      8 python2.4 RET   ioctl 0
   400      8 python2.4 CALL  munlock(0x7f7ff73fb000,0x1000)
   400      8 python2.4 RET   munlock 0
   400      8 python2.4 CALL  mmap(0,0x1000,3,0x1001,0xffffffff,0,0)
   400      8 python2.4 RET   mmap 140187698843648/0x7f7ffdfdd000
   400      8 python2.4 CALL  ioctl(8,_IOW('P',0x2,0x10),0x7f7ff73fb790)
   400      8 python2.4 GIO   fd 8 wrote 16 bytes
       "\^A\0\0\0\^A\0\0\0p\M-7?\M-w\^?\^?\0\0"
   400      8 python2.4 RET   ioctl 0
   400      3 python2.4 RET   select 0
   400      3 python2.4 CALL  _lwp_park(0,0,0x7f7ffd104e80,0x7f7ffd104e80)
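
To make the last ioctl in the trace easier to read, here is a minimal
sketch that decodes its 16-byte payload. The struct below is a hypothetical
mirror of the argument (an int count, a 16-bit domain id, padding, then a
64-bit user pointer); that layout is an assumption made for illustration
and is not copied from the NetBSD headers. The byte values are the GIO
escapes expanded by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the 16-byte ioctl argument. Field names and
 * layout are assumptions for illustration only. */
struct mmap_req {
    int32_t  num;    /* number of mapping entries */
    uint16_t dom;    /* target domain id */
    uint64_t entry;  /* user-space pointer to the entry array */
};

/* Payload of the _IOW('P',0x2,0x10) call, escapes expanded by hand:
 * \^A = 0x01, p = 0x70, \M-7 = 0xb7, ? = 0x3f, \M-w = 0xf7, \^? = 0x7f. */
static const uint8_t gio_bytes[16] = {
    0x01, 0x00, 0x00, 0x00,                         /* num */
    0x01, 0x00,                                     /* dom */
    0x00, 0x00,                                     /* assumed padding */
    0x70, 0xb7, 0x3f, 0xf7, 0x7f, 0x7f, 0x00, 0x00  /* entry pointer */
};

/* Read an n-byte little-endian unsigned value (amd64 byte order). */
static uint64_t read_le(const uint8_t *p, size_t n)
{
    uint64_t v = 0;
    assert(n <= 8);
    while (n-- > 0)
        v = (v << 8) | p[n];
    return v;
}

/* Decode the raw payload into the hypothetical struct above. */
static struct mmap_req decode_mmap_req(const uint8_t *buf)
{
    struct mmap_req r;
    r.num   = (int32_t)read_le(buf, 4);
    r.dom   = (uint16_t)read_le(buf + 4, 2);
    r.entry = read_le(buf + 8, 8);   /* bytes 6..7 skipped as padding */
    return r;
}
```

Under that assumed layout, decode_mmap_req(gio_bytes) gives num = 1,
dom = 1 and an entry pointer of 0x7f7ff73fb770, i.e. a request to map a
single range into domain 1. The entry array itself is not part of the GIO
record, so the trace does not show which address was requested.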

> I guess it's looping on page fault. To check for what's going on I would:
> - check that privcmd_map_obj() gets the protection right for this address
>   (it should be VM_PROT_READ | VM_PROT_WRITE, as we're remapping a range
>    which was mmapped read/write)

Yes, protection is VM_PROT_READ | VM_PROT_WRITE.

> - instrument privpgop_fault() to see if it gets called at all for this
>   mapping, and if it's doing the right thing.
>   There should be only one page in this object, and the machine address
>   should be 0 (pobj->maddr[maddr_i])

Yes, privpgop_fault() is called. It looks like it's called in a loop.
npages = 1 and machine address is 0.

> - if privpgop_fault() behaves properly, check that the xpq_update_foreign()
>    call in pmap_enter_ma() works as intended.

It is not failing at least.

> But I'm wondering if calling IOCTL_PRIVCMD_MMAP with a machine address
> of 0 is really what's intended here. The physical address of the
> page in the domU is 0, but I doubt the machine address is really 0.
> For example in libxc/xc_dom_boot.c, xc_dom_p2m_host() is called to
> convert the physical page number in the new domain to machine
> frame number before calling xc_map_foreign_ranges().
> On the other hand, there's also xc_hvm_build.c which calls
> xc_map_foreign_range() with HVM_E820_PAGE, which is a constant.
> So I'm not sure what's right here. Is xpq_update_foreign() behaving
> differently for HVM and PV guests ?
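
For reference, the only reason page 0 of the guest is mapped at all is to
write the 5-byte JMP shown in the setup_guest() excerpt. The sketch below
illustrates that encoding: opcode 0xe9 followed by a 32-bit little-endian
displacement measured from the end of the instruction, which is why the
original code stores entry_eip - 5 for an instruction at address 0. The
function name and the insn_addr parameter are hypothetical and not part of
libxc. This only illustrates what gets written, it says nothing about why
the store itself loops on a page fault.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { JMP_REL32_LEN = 5 };   /* 1 opcode byte + 4 displacement bytes */

/* Hypothetical helper: encode "JMP rel32" located at insn_addr so that it
 * transfers control to target. buf must have room for JMP_REL32_LEN bytes. */
static void encode_jmp_rel32(uint8_t *buf, uint32_t insn_addr, uint32_t target)
{
    /* The displacement is relative to the address of the next instruction. */
    uint32_t rel = target - (insn_addr + JMP_REL32_LEN);

    assert(buf != NULL);
    buf[0] = 0xe9;                          /* JMP rel32 opcode */
    buf[1] = (uint8_t)(rel & 0xff);         /* displacement, little-endian */
    buf[2] = (uint8_t)((rel >> 8) & 0xff);
    buf[3] = (uint8_t)((rel >> 16) & 0xff);
    buf[4] = (uint8_t)((rel >> 24) & 0xff);
}
```

Writing the displacement byte by byte avoids the unaligned uint32_t store
through a cast that the quoted code uses. With insn_addr = 0 and an
example entry point of 0x100000, the buffer ends up holding
e9 fb ff 0f 00.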



