Subject: Re: Panic killing a process
To: Julio M. Merino Vidal <jmmv@menta.net>
From: Pavel Cahyna <pavel.cahyna@st.cuni.cz>
List: current-users
Date: 01/08/2005 14:59:56
On Sat, 08 Jan 2005 11:13:21 +0100, Julio M. Merino Vidal wrote:

> On Sat, 8 Jan 2005 12:05:32 +1030
> Greg 'groggy' Lehey <grog@NetBSD.org> wrote:
> 
>> You could start by looking at what's going on in frame 9.  Look at the
>> local variables.  On the face of it I'd guess a null pointer
>> dereference, but you should have messages from trap() telling you what
>> happens.
> 
> Yeah, well... the problem is that I shouldn't spend a lot of time in front
> of the computer these days...
> 
> Anyway, here is what I got:
> 
> The kernel fails due to (thanks to Pavel Cahyna for telling me about
> dmesg -M):
> uvm_fault(0xca728460, 0x97c14000, 0, 2) -> 0xe
> 
> Then, in frame 9, the offending function is fdfree in
> kern_descrip.c.  It fails in line 1284 on the call to knote_fdclose:
> 
> (gdb) frame 9
> #9  0xc021318d in fdfree (p=0xca72ce5c)
>     at /usr/src/sys/kern/kern_descrip.c:1284
> 1284                                    knote_fdclose(p, fdp->fd_lastfile - i);
> 
> At that point, frame 8 is already a trap, so knote_fdclose is not
> even reached, right?  Therefore the problem has to be in one of its
> parameters, that is, in the access to fdp.  Is that right?

Please use "bt/l" in ddb the next time you reproduce this bug. I have seen
gdb miss one frame sometimes (maybe always?) when the kernel panics with
uvm_fault, which is usually a symptom of a NULL-pointer dereference. The
missing frame does show up in ddb. See for example PR kern/28669:

#3  0xc035e5cd in trap (frame=0xce62b828)
    at ../../../../arch/i386/i386/trap.c:296
#4  0xc0102e9f in calltrap ()
#5  0xc012b152 in frpr_ipv6hdr (fin=0xce62b8f8)
    at ../../../../netinet/fil.c:464

and compare it with a ddb backtrace of the same bug from the duplicate PR
kern/28875:

		db> bt
		fr_coalesce
		frpr_ipv6hdr

Clearly the fr_coalesce function (where the panic really happens) is
missing from the gdb backtrace.

Bye	Pavel

P.S. I should really write a document like "best debugging practices" as
they are highly non-obvious.