Subject: Re: panic: kernel fault
To: Hubert Feyrer <hubert.feyrer@rz.uni-regensburg.de>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: port-sparc
Date: 04/03/1999 19:36:35
On Sat, Apr 03, 1999 at 01:31:51AM +0200, Hubert Feyrer wrote:
> On Fri, 2 Apr 1999, Manuel Bouyer wrote:
> > Hum, can you compile a kernel with '-g', so that we know where it is failing
> > exactly, and on which value ?
> 
> OK, I'm running a (stripped version of a) kernel with debugging symbols
> now, and here's some more data:
> 
> panic: kernel fault
> #0  mi_switch () at ../../../../kern/kern_synch.c:632
> 632             cpu_switch(p);
> (gdb) bt
> #0  mi_switch () at ../../../../kern/kern_synch.c:632
> #1  0xf002e264 in tsleep (ident=0x0, priority=4, wmesg=0xf00d2a80 "scheduler", 
>     timo=0) at ../../../../kern/kern_synch.c:370
> #2  0xf00d2b40 in uvm_scheduler () at ../../../../uvm/uvm_glue.c:436
> #3  0xf001fcf8 in main () at ../../../../kern/init_main.c:422
> 
> (The kernel was generated after a "make clean"... I have no clue why not
> all arguments are shown here ... the gdb is still from 1.3 . Update is
> delayed by frequent crashes :|)

My IPC panics the same way. I'm not sure, but it may be related to paging.
I've been unable to link a -g compiled netbsd while running such a kernel; I've
got to boot an older one. With -g, the final ld process may be bigger than my
physical RAM (20MB). I can't run gdb on netbsd.gdb either (netbsd.gdb is 15MB).
I get a stack trace similar to yours. cpu_switch() is an assembly routine from
locore.s, and p is a valid pointer. I can't see a missing argument here:
uvm_scheduler() and main() don't have arguments.

> 
> 
> > Hum, alpha, i386, and now sparc ports are broken. Would be nice if all had the
> > same cause :)
> 
> Hum, I'm also running 1.4-current (kernel) on i386 here, and it seems
> stable to me (though i'm not compiling much, it's mainly an X terminal).

I've seen panics in the pmap module; they started at about the same time
as the problems on alpha. I can reproduce this reliably on my home PC
('make -j4 clean' in an already clean source tree), but not on
another machine. On port-i386 only one other person has seen such panics.

On port-alpha, Matthew Jacob narrowed down the changes which cause problems
to something between 99.03.26.04.00.00 and 99.03.26.08.00.00. I'll grab
a src/sys from both source dates and see how it goes on my i386 and sparc.
However, I fear I will not be able to look at this tomorrow.

--
Manuel Bouyer <bouyer@antioche.eu.org>
--