Subject: Re: HEAD instability on Xen
To: Antti Kantee <pooka@cs.hut.fi>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: port-i386
Date: 11/18/2007 21:45:29
On Sun, Nov 18, 2007 at 10:39:34PM +0200, Antti Kantee wrote:
> On Sun Nov 18 2007 at 21:19:15 +0100, Manuel Bouyer wrote:
> > Hi,
> > I've tested a kernel built from HEAD (which I hadn't done for some time) and
> > I'm seeing panics when starting xend (this dom0 has only 64MB RAM allocated).
> > It's always a uvm_fault() on a kernel address, but the fault instruction
> > varies. Here's a sample of a panic:
> > 
> > Starting xend.
> > uvm_fault(0xc09605c0, 0xc8028000, 1) -> 0xe
> > fatal page fault in supervisor mode
> > trap type 6 code 0 eip c03c8bdd cs 9 eflags 10246 cr2 0 ilevel 0
> > kernel: supervisor trap page fault, code=0
> > Stopped in pid 169.1 (python2.4) at     netbsd:uvm_map_lookup_entry+0x4d:       cmpl     %edi,0x20(%ebx)
> > db> tr
> > uvm_map_lookup_entry(c5db0ac4,8063000,c7fa6690,1,0) at netbsd:uvm_map_lookup_entry+0x4d
> > uvm_fault_internal(c5db0ac4,8063000,2,0,c04db079) at netbsd:uvm_fault_internal+0xdd
> > trap() at netbsd:trap+0x415
> 
> Looks an awful lot like corrupted vm_maps for the newly created process.
> But I don't know why this would be related to Xen, and specifically to
> starting xend.

I think it's because at this point in the boot process, xend is the largest
process, and memory is getting low.

> 
> What do you mean that the fault instruction varies?  Is it still always
> trying to access a vm_map_entry?  I'm guessing the above example was
> from line 1556 in uvm_map.c.

I've also seen it when taking a lock, but I can't remember whether the lock
was in a vm_map_entry. I've also seen it in the pool code. In every case,
though, it was handling a trap after a copyin or copyout.
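
For anyone reading the trace above: the "code 0" in the trap line is the x86
page-fault error code, and the 0xe returned by uvm_fault() is 14, which is
EFAULT. Below is a minimal sketch of how the error code bits decode. The bit
meanings are architectural; the macro names and the describe_pf_code() helper
are hypothetical and made up for illustration, they are not taken from the
NetBSD sources.

```c
#include <stdio.h>

/* x86 page-fault error code bits (architectural). Names are illustrative. */
#define PF_PROT  0x1u  /* 0 = page not present, 1 = protection violation */
#define PF_WRITE 0x2u  /* 0 = read access,      1 = write access */
#define PF_USER  0x4u  /* 0 = supervisor mode,  1 = user mode */

/*
 * Hypothetical helper: write a human-readable description of a
 * page-fault error code into buf. Returns what snprintf() returns.
 */
static int
describe_pf_code(unsigned code, char *buf, size_t len)
{
	return snprintf(buf, len, "%s, %s, %s mode",
	    (code & PF_PROT) ? "protection violation" : "page not present",
	    (code & PF_WRITE) ? "write" : "read",
	    (code & PF_USER) ? "user" : "supervisor");
}
```

With code 0 this gives "page not present, read, supervisor mode", which
matches the "supervisor trap page fault" message and the read access (1)
passed to uvm_fault() in the panic.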

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference