Port-xen archive

Re: init receiving SIGILL on XEN3_DOM0/amd64



On Tuesday 27 May 2008 14:28:05 Andrew Doran wrote:
> On Tue, May 27, 2008 at 01:31:21PM +0200, Christoph Egger wrote:
> > On Monday 26 May 2008 22:57:06 Christoph Egger wrote:
> > > Manuel Bouyer wrote:
> > > > On Sun, May 18, 2008 at 02:32:37PM +0200, Manuel Bouyer wrote:
> > > >> On Sun, May 18, 2008 at 01:43:04PM +0200, Christoph Egger wrote:
> > > >>>> When the CPU sends an illegal instruction trap, I guess.
> > > >>>
> > > >>> In that case, I have a wild guess:
> > > >>> sysenter is an illegal opcode on amd64. The difference is, AMD CPUs
> > > >>> throw an exception, Intel CPUs don't.
> > > >>> Is sysenter used somewhere?
> > > >>
> > > >> I don't think it should be for amd64 binaries.
> > > >> i386 binaries may, but only if I686_LIBC is defined. Otherwise int
> > > >> $0x80 is used.
> > > >
> > > > OK, I've been seeing this occasionally now on an Intel CPU. A dom0
> > > > kernel will fail to start init in this way once in a while. But
> > > > hitting ^A^A^AR on the serial console causes the system to reboot, and
> > > > the next boot is usually successful without changing anything in
> > > > kernel or boot options.
> > > >
> > > > The intermittent aspect makes it hard to debug, and also confuses me
> > > > a bit about what the cause could be.
> > >
> > > I see this too, on an AMD CPU. I tracked this down to when start_init()
> > > returns to lwp_trampoline which in turn returns to Xosyscall.
> > > Xosyscall wants to send SIG 4 to pid 1 (no idea why)
> > > and calls trap for this.
> > > In trap a page fault happens which in turn leads into an
> > > endless loop of traps into ddb.
> >
> > start_init() returns to lwp_trampoline which in turn jumps into Xosyscall.
> >
> > Xosyscall notices an AST is pending and calls trap. trap sends SIG 4
> > to pid 1 (init).
> >
> > I hope that helps someone who understands the code better than I do
> > to fix this.
>
> Does this patch have any impact?
>
>       http://www.netbsd.org/~ad/vm_machdep.c.diff

Yes it does. It fixes the problem for me.
i386 also needs this fix, as some people have reported the same
issue there.
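
For anyone following along: the "SIG 4" mentioned in the quoted analysis is
SIGILL, which matches the subject line. The snippet below is only a small
user-space sketch for looking up the symbolic name of a signal number on the
local system. It is not part of the kernel code or of the patch, and
describe_signal is a hypothetical helper name chosen for illustration.

```python
import signal


def describe_signal(signum: int) -> str:
    """Return the symbolic name the local platform gives to a signal number."""
    # signal.Signals is the standard-library enum of signals known on this OS;
    # it raises ValueError for a number that has no name here.
    return signal.Signals(signum).name


if __name__ == "__main__":
    # Signal 4 is the one init was receiving in this thread.
    print(describe_signal(4))
```

Run on the affected machine, this should confirm that signal 4 is SIGILL there.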

Thank you very much for fixing this.

Christoph

