Subject: Re: kern/20914: kernel panic in sysctl_procargs()
To: David Laight <david@l8s.co.uk>
From: Andrew Brown <atatat@atatdot.net>
List: netbsd-bugs
Date: 04/08/2003 12:14:38
>Ok, so the break is in this 'memcpy' - which the compiler has inlined
>to a single 'mov' instruction (at sysctl_procargs+0x1fd):
right.
>> /usr/src/sys/arch/i386/compile/CRASH/../../../../kern/kern_sysctl.c:2126
>> case KERN_PROC_ARGV:
>> /* XXX compat32 stuff here */
>> memcpy(&tmp, (char *)&pss + p->p_psargv, sizeof(tmp));
>> c030f590 <sysctl_procargs+0x1f4> 8b 7d bc mov 0xffffffbc(%ebp),%edi
>> c030f593 <sysctl_procargs+0x1f7> 8b 87 7c 01 00 00 mov 0x17c(%edi),%eax
>> c030f599 <sysctl_procargs+0x1fd> 8b 44 28 f0 mov 0xfffffff0(%eax,%ebp,1),%eax
>> /usr/src/sys/arch/i386/compile/CRASH/../../../../kern/kern_sysctl.c:2127
>> break;
>
>Now we don't have the registers of the panic (anyone fancy fixing ddb?)
and i can't get to that frame in gdb. :-/
>but the above should have extreme difficulty in exploding.
that's what i figured, unless one of the registers was wrong.
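for what it's worth, the address that mov touches is just displacement + base + index, so the fault address pins down the sum of the two registers even without a register dump. a minimal sketch of that arithmetic in plain userland c (the function names are made up for illustration, and 32-bit wraparound is modelled with uint32_t):

```c
#include <stdint.h>

/*
 * effective address of "mov disp(%base,%index,1),%reg" on i386:
 * disp + base + index, truncated to 32 bits.
 */
static uint32_t
effective_address(uint32_t disp, uint32_t base, uint32_t index)
{
	return disp + base + index;	/* unsigned math wraps mod 2^32 */
}

/*
 * given the fault address (cr2) and the displacement encoded in the
 * instruction, recover base + index.  for the mov at
 * sysctl_procargs+0x1fd that is %eax + %ebp, i.e. the value loaded
 * from p->p_psargv plus the frame pointer.
 */
static uint32_t
base_plus_index(uint32_t cr2, uint32_t disp)
{
	return cr2 - disp;
}
```

with cr2 = 0xcfcf8e50 and disp = 0xfffffff0 (that is, -0x10), base_plus_index() gives 0xcfcf8e60, so %eax and %ebp between them have to add up to that. it doesn't say which of the two was off.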
>p->p_psargv is a constant - I think it should be 0 for all processes
>and all the time. The only time it will be wrong (for a pointer that
>has been a valid proc pointer) is if the process has exited and the
>memory page released from the pool back for general use.
>
>Any thoughts?
>
>I don't see any need for p->p_psargv and friends - but I can't quite
>see how the value read can be invalid - even though it must be!
>
>OTOH this code is completely borked should the process actually exit!
it's safe to say my x server wasn't exiting. i've got the core, the
kernel file, and the netbsd.gdb file, but i can't step back up to the
frame in which that "call" resides. otoh, given:
uvm_fault(0xc06b71c0, 0xcfcf8000, 0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 eip c030f599 cs 8 eflags 10246 cr2 cfcf8e50 ilevel 0
panic: trap
Begin traceback...
trap() at trap+0x21a
--- trap (number 6) ---
sysctl_procargs(cfcd8f18,2,807b000,cfcd8f0c,cfcc8500) at sysctl_procargs+0x1fd
kern_sysctl(cfcd8f14,3,807b000,cfcd8f0c,0) at kern_sysctl+0x4b4
sys___sysctl(cfae3688,cfcd8f80,cfcd8f78,c0ac6340,bdbbd2e0) at sys___sysctl+0x1f2
syscall_plain(1f,1f,1f,1f,4) at syscall_plain+0xab
End traceback...
and
        sysctl_procargs(int *name, u_int namelen, void *where, size_t *sizep,
            struct proc *up)
i can dump arbitrary bits of "memory", like so:
(gdb) x/2x 0xcfcd8f18
0xcfcd8f18: 0x00000277 0x00000001
that's 0x277 (the x server), 0x1 (KERN_PROC_ARGV), and since name is a
pointer that gets passed in from kern_sysctl() thusly:
        case KERN_PROC_ARGS:
                return (sysctl_procargs(name + 1, namelen - 1,
                    oldp, oldlenp, p));
(gdb) x/3x 0xcfcd8f14
0xcfcd8f14: 0x00000030 0x00000277 0x00000001
the 0x30 is KERN_PROC_ARGS, and since it, in turn, gets called from
sys___sysctl() like so:
        error = (*fn)(name + 1, SCARG(uap, namelen) - 1, SCARG(uap, old),
            oldlenp, SCARG(uap, new), SCARG(uap, newlen), p);
and since name is a variable local to sys___sysctl() (whose address is
apparently 0xcfcd8f10), i should be able to find the sys___sysctl()
stack frame somewhere around 0xcfcd8f10. let's see...pointer,
pointer, int, a couple of size_ts, another pointer, an array of ints
called name (CTL_MAXNAME aka 12 in length), and another
pointer...that's 19 x 4 (this is i386), so we look at this:
(gdb) x/48x 0xcfcd8f00
0xcfcd8f00: 0xc030cae0 0x00040000 0xcfcc8500 0x00040000
0xcfcd8f10: 0x00000001 0x00000030 0x00000277 0x00000001
0xcfcd8f20: 0x0000002e 0x00000000 0xcfcd8f68 0xc0b9b000
0xcfcd8f30: 0x0180d00a 0x2f83d00a 0x00000000 0x00000004
0xcfcd8f40: 0xcfcd8fa0 0xc03a1e3b 0xcfae3688 0xcfcd8f80
0xcfcd8f50: 0xcfcd8f78 0xc0ac6340 0xbdbbd2e0 0xbfbfedbc
0xcfcd8f60: 0x00000004 0x00000000 0x00000006 0xc043f464
0xcfcd8f70: 0xcfcd8fa0 0xc0699fb0 0x00000000 0x00000000
0xcfcd8f80: 0xbfbfedbc 0x00000004 0x0807b000 0xbfbfedb8
0xcfcd8f90: 0x00000000 0x00000000 0xc0ac8500 0xc0ac8500
0xcfcd8fa0: 0xbfbfed5c 0xc0100c6b 0x0000001f 0x0000001f
0xcfcd8fb0: 0x0000001f 0x0000001f 0x00000004 0xbfbfedbc
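as a sanity check on the 19 x 4 arithmetic, here's a minimal sketch that totals the 4-byte slots and works out what an x/48x window covers. the slot list just mirrors the order given above (the labels are descriptive, not the names used in kern_sysctl.c), and it assumes every item is 4 bytes wide, as on i386:

```c
#include <stdint.h>

#define SLOT_BYTES	4	/* ints, size_ts and pointers on i386 */
#define CTL_MAXNAME	12	/* length of the name array */

/* number of 4-byte slots per item, in the order listed in the text */
static const unsigned slot_counts[] = {
	1,		/* pointer */
	1,		/* pointer */
	1,		/* int */
	2,		/* a couple of size_ts */
	1,		/* another pointer */
	CTL_MAXNAME,	/* int name[CTL_MAXNAME] */
	1,		/* another pointer */
};

/* total number of slots in the list above */
static unsigned
total_slots(void)
{
	unsigned i, n = 0;

	for (i = 0; i < sizeof(slot_counts) / sizeof(slot_counts[0]); i++)
		n += slot_counts[i];
	return n;
}

/* first address past an "x/<nwords>x <base>" dump */
static uint32_t
dump_end(uint32_t base, unsigned nwords)
{
	return base + nwords * SLOT_BYTES;
}
```

total_slots() comes to 19, so 76 bytes, and dump_end(0xcfcd8f00, 48) is 0xcfcd8fc0, so a 48-word dump leaves room to spare around the name array at 0xcfcd8f10.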
now...the { 0x1, 0x30, 0x277, 0x1 } is the name array (where the
initial 0x1 is presumably CTL_KERN). going back to the ddb stackdump
that says the fourth argument to sysctl_procargs() was 0xcfcd8f0c,
that must be the address of sizep (known here as oldlen), which holds
the 0x40000 value. since i've dumped the vmspaces of all the processes
from the kernel core, i can say that the ps process was pid 334, which
has a paddr (according to ps) of cfcc8500. that value, the fifth
argument (up) in the traceback, appears at 0xcfcd8f08 in
the dump above. so...where's the thing that i feed to gdb to step to
that stack frame?
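one way to hunt for a frame by hand: on i386 with frame pointers, a frame normally looks like saved %ebp, return address, then the arguments, so the argument list ddb printed for a function should show up in the dump two words above that function's frame pointer. a rough sketch of that search over the 48 words dumped above, assuming that layout holds for this kernel (find_args() and frame_for_args() are made-up names, and whether gdb will accept the resulting address as a frame is not something this shows):

```c
#include <stddef.h>
#include <stdint.h>

#define DUMP_BASE	0xcfcd8f00u	/* address given to x/48x */

/* the 48 words printed by "x/48x 0xcfcd8f00" */
static const uint32_t dump[48] = {
	0xc030cae0, 0x00040000, 0xcfcc8500, 0x00040000,
	0x00000001, 0x00000030, 0x00000277, 0x00000001,
	0x0000002e, 0x00000000, 0xcfcd8f68, 0xc0b9b000,
	0x0180d00a, 0x2f83d00a, 0x00000000, 0x00000004,
	0xcfcd8fa0, 0xc03a1e3b, 0xcfae3688, 0xcfcd8f80,
	0xcfcd8f78, 0xc0ac6340, 0xbdbbd2e0, 0xbfbfedbc,
	0x00000004, 0x00000000, 0x00000006, 0xc043f464,
	0xcfcd8fa0, 0xc0699fb0, 0x00000000, 0x00000000,
	0xbfbfedbc, 0x00000004, 0x0807b000, 0xbfbfedb8,
	0x00000000, 0x00000000, 0xc0ac8500, 0xc0ac8500,
	0xbfbfed5c, 0xc0100c6b, 0x0000001f, 0x0000001f,
	0x0000001f, 0x0000001f, 0x00000004, 0xbfbfedbc,
};

/* word index of the first occurrence of args[] in dump[], or -1 */
static int
find_args(const uint32_t *args, size_t nargs)
{
	size_t i, j, n = sizeof(dump) / sizeof(dump[0]);

	for (i = 0; i + nargs <= n; i++) {
		for (j = 0; j < nargs; j++)
			if (dump[i + j] != args[j])
				break;
		if (j == nargs)
			return (int)i;
	}
	return -1;
}

/*
 * candidate frame pointer for a function called with args[]:
 * the arguments start at %ebp + 8, so back up two words.
 * returns 0 if the arguments are not in the dump, or if the two
 * words below them fall outside it.
 */
static uint32_t
frame_for_args(const uint32_t *args, size_t nargs)
{
	int i = find_args(args, nargs);

	if (i < 2)
		return 0;
	return DUMP_BASE + 4u * (uint32_t)(i - 2);
}
```

fed the sys___sysctl() arguments from the traceback (cfae3688, cfcd8f80, cfcd8f78, c0ac6340, bdbbd2e0), frame_for_args() comes back with 0xcfcd8f40. the word stored at that address is 0xcfcd8fa0, which is also what it returns for the syscall_plain() arguments (1f, 1f, 1f, 1f, 4), so the two candidates are at least consistent with a saved-%ebp chain. the kern_sysctl() arguments don't appear in this window at all, so that frame would have to be below 0xcfcd8f00.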
--
|-----< "CODE WARRIOR" >-----|
codewarrior@daemon.org * "ah! i see you have the internet
twofsonet@graffiti.com (Andrew Brown) that goes *ping*!"
werdna@squooshy.com * "information is power -- share the wealth."