Subject: Re: kernel panic on or after bootup
To: David Lord <david@lordynet.org>
From: Pavel Cahyna <pavel.cahyna@st.mff.cuni.cz>
List: netbsd-help
Date: 07/06/2006 17:39:15
On Thu, Jul 06, 2006 at 10:56:14AM -0000, David Lord wrote:
> On 5 Jul 2006, at 23:07, Christos Zoulas wrote:
> 
> > In article <44AC3080.15957.784ADE@localhost>,
> > David Lord <david@lordynet.org> wrote:
> > >
> > >I have four pcs running NetBSD 3.0. On reboot I'm regularly seeing a
> > > >kernel panic from three of them, either during bootup or a few
> > > >minutes later. The pc that doesn't seem to show the problem is a
> > > >p4-2000, whilst the slowest, an amd586-133, is most likely to give the
> > > >panic, maybe 1 in 3 reboots. The k6-400 has managed uptime of 41 days and
> > >then had to be rebooted a couple of times to use a new kernel. I've
> > >seen the problem with GENERIC kernel but needed some extra options
> > > >and took the opportunity to remove others that weren't relevant.
> > >
> > >Error is same:
> > >kernel: page fault trap, code=0
> > >Stopped at: netbsd:fr_derefrule+0x1c7: cmpl $0x2,0x88(%edx)
> > >
> > >Any ideas as to how to track this down?
> > 
> > You can compile the offending file adding -S -gstabs to the compile
> > line to produce an assembly file. Then you can objdump --disassemble
> > the kernel and locate the offending code. Finally locating the line by
> > matching the assembly instructions in the .s file should give you the
> > c code line number from the stabs. From there, you can look at the c
> > code and possibly add some debugging to help you track the problem
> > down.
> 
> OK, thanks for that. First I didn't even know what the format of the 
> error message was other than to me it appeared to refer to a routine 
> and possibly give the instruction that caused the problem.
> 
> I've just run grep on the kernel source and only found fr_derefrule 
> to be present in several files in /usr/src/sys/dist/ipf.
> 
> It will be next week before I get time to compile to get the assembly 
> code. I've also downloaded more recent ipf and will compare with 
> that.

It should be enough to compile the kernel with makeoptions DEBUG="-g".
You'll get a netbsd.gdb file which can be used to extract line number
information, like

gdb netbsd.gdb
(gdb) list *(fr_derefrule+0x1c7)

(or whatever address is shown in the panic message)

That should be simpler than disassembling.
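
If you end up doing this for several panics, a small script can turn the
"Stopped at" line into the gdb command for you. This is only a rough sketch
in Python: it assumes the ddb line always has the symbol+0xoffset form quoted
above, and the name gdb_list_command is made up for illustration.

```python
import re

# Picks the "symbol+0xoffset" part out of a ddb "Stopped at" line such as
#   Stopped at: netbsd:fr_derefrule+0x1c7: cmpl $0x2,0x88(%edx)
# The "netbsd:" prefix is treated as optional (an assumption).
STOPPED_RE = re.compile(
    r"Stopped at:?\s+(?:\w+:)?(?P<sym>[A-Za-z_]\w*)\+(?P<off>0x[0-9a-fA-F]+)"
)


def gdb_list_command(panic_line):
    """Return the gdb 'list' command for the faulting address (hypothetical helper)."""
    m = STOPPED_RE.search(panic_line)
    if m is None:
        raise ValueError("no symbol+offset found in: %r" % panic_line)
    return "list *(%s+%s)" % (m.group("sym"), m.group("off"))


if __name__ == "__main__":
    line = "Stopped at: netbsd:fr_derefrule+0x1c7: cmpl $0x2,0x88(%edx)"
    print(gdb_list_command(line))
```

Paste the printed command at the (gdb) prompt after loading netbsd.gdb.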

Oh, and when this happens please also do bt/l in ddb and note the output.
You can also try to save a kernel crash dump with a
reboot 0x104
command (in ddb).

Pavel