Current-Users archive


Re: Current panics



Manuel Bouyer <bouyer%antioche.eu.org@localhost> writes:

> On Fri, Nov 20, 2009 at 09:57:01PM +0100, Manuel Bouyer wrote:
>> On Fri, Nov 20, 2009 at 08:54:00PM +0300, Aleksej Saushev wrote:
>> > > I can't see where the _atomic_inc_32 could come from in
>> > > vn_open(). Could you see what is the address of vn_open() in your kernel
>> > > (nm /netbsd |grep vn_open should tell) ?
>> > 
>> > c0462e4d B vn_open
>> > 
>> > Is there any way to associate addresses to line numbers?
>> 
>> The only way I know is with gdb, but this needs a kernel compiled with -g.
>> Here this would give:
>> (gdb) l *(vn_open+320) 
>> 0xc0665050 is in vn_open (/home/bouyer/HEAD/src/sys/kern/vfs_vnops.c:149).
>> 144             if (fmode & O_CREAT) {
>> 145                     ndp->ni_cnd.cn_nameiop = CREATE;
>> 146                     ndp->ni_cnd.cn_flags |= LOCKPARENT | LOCKLEAF;
>> 147                     if ((fmode & O_EXCL) == 0 &&
>> 148                         ((fmode & O_NOFOLLOW) == 0))
>> 149                             ndp->ni_cnd.cn_flags |= FOLLOW;
>> 150             } else {
>> 151                     ndp->ni_cnd.cn_nameiop = LOOKUP;
>> 152                     ndp->ni_cnd.cn_flags |= LOCKLEAF;
>> 153                     if ((fmode & O_NOFOLLOW) == 0)
>> 
>> So it doesn't make sense.
>> Could you send me your kernel config file (if it's not GENERIC) ?
>
> With the right kernel config file I get:
> (gdb) l *(vn_open+320)
> 0xc0462edd is in vn_open (/home/bouyer/HEAD/src/sys/kern/vfs_vnops.c:159).
> 154                             ndp->ni_cnd.cn_flags |= FOLLOW;
> 155             }
> 156     
> 157             VERIEXEC_PATH_GET(ndp->ni_dirp, ndp->ni_segflg, ndp->ni_dirp, path);
> 158     
> 159             error = namei(ndp);
> 160             if (error)
> 161                     goto out;
> 162     
> 163             vp = ndp->ni_vp;
>
> So it could be in namei(). If you still have your netbsd.gdb around,
> could you confirm it (the addresses in my binary don't exactly match
> what you reported)?
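
To make the address arithmetic in the quoted gdb sessions explicit, here is a small sketch. It only restates numbers already in this thread: `l *(vn_open+320)` resolves the address of vn_open plus 320 bytes, so subtracting 320 from what gdb printed gives the start of vn_open in that build. The helper name symbol_base is made up for illustration and is not part of gdb or nm.

```python
# Sketch only: recover where vn_open starts in each build from the
# address gdb printed for "vn_open+320". symbol_base is a hypothetical
# helper name, not a gdb or nm facility.
VN_OPEN_OFFSET = 320  # the "+320" used in the gdb commands above


def symbol_base(resolved_addr: int, offset: int) -> int:
    """Return the symbol start given the address of symbol+offset."""
    return resolved_addr - offset


# Addresses gdb reported for vn_open+320 in the two quoted sessions.
first_build = symbol_base(0xC0665050, VN_OPEN_OFFSET)
second_build = symbol_base(0xC0462EDD, VN_OPEN_OFFSET)

# Address reported by "nm /netbsd | grep vn_open" on the panicking machine.
reported = 0xC0462E4D

print(hex(first_build))
print(hex(second_build))
print(reported - second_build)  # byte difference between the two kernels
```

The non-zero difference on the last line is one way to see that the two binaries do not line up exactly, so a line number taken from one kernel is only approximate for the other.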

Alright, I've just had another panic:

fatal page fault in supervisor mode
trap type 6 code 0 eip c0347585 cs 8 eflags 210202 cr2 0 ilevel 8

dumping to dev 0,1 offset 1592368


(gdb) target kvm netbsd.44.core
#0  cpu_reboot (howto=256, bootstr=0x0) at /usr/src/sys/arch/i386/i386/machdep.c:864
864             splx(s);
(gdb) bt
#0  cpu_reboot (howto=256, bootstr=0x0) at /usr/src/sys/arch/i386/i386/machdep.c:864
#1  0xc017f9b7 in db_sync_cmd (addr=-1070303867, have_addr=false, count=-1067765371, modif=0xcc791804 "╗H\027└")
    at /usr/src/sys/ddb/db_command.c:1375
#2  0xc01800fa in db_command (last_cmdp=0xc05a0f1c) at /usr/src/sys/ddb/db_command.c:909
#3  0xc018033d in db_command_loop () at /usr/src/sys/ddb/db_command.c:567
#4  0xc0185ca0 in db_trap (type=6, code=0) at /usr/src/sys/ddb/db_trap.c:101
#5  0xc0182c94 in kdb_trap (type=6, code=0, regs=0xcc791a2c) at /usr/src/sys/arch/i386/i386/db_interface.c:226
#6  0xc03dd0ee in trap (frame=0xcc791a2c) at /usr/src/sys/arch/i386/i386/trap.c:354
#7  0xc010cb3f in calltrap ()
#8  0xc0347585 in pmap_activate (l=0xcbead0e0) at /usr/src/sys/arch/x86/x86/pmap.c:2527
#9  0xc0283b50 in mi_switch (l=0xcbead0e0) at /usr/src/sys/kern/kern_synch.c:771
#10 0xc0280fd3 in sleepq_block (timo=0, catch=true) at /usr/src/sys/kern/kern_sleepq.c:262
#11 0xc03bd959 in sel_do_scan (fds=0xcc791ba0, nfds=1, ts=0x0, mask=0x0, retval=0xcc791d28, selpoll=0)
    at /usr/src/sys/kern/sys_select.c:253
#12 0xc03bdb71 in pollcommon (retval=0xcc791d28, u_fds=0xbb80c070, nfds=1, ts=0x0, mask=0x0)
    at /usr/src/sys/kern/sys_select.c:440
#13 0xc03bdc95 in sys_poll (l=0xcbead0e0, uap=0xcc791d00, retval=0xcc791d28) at /usr/src/sys/kern/sys_select.c:378
#14 0xc03bf736 in syscall (frame=0xcc791d48) at /usr/src/sys/sys/syscallvar.h:61
#15 0xc0100524 in syscall1 ()


I remember this "pmap_activate ... sys_poll" sequence; it was observed recently.

"bt full" reveals these details:

#8  0xc0347585 in pmap_activate (l=0xcbead0e0) at /usr/src/sys/arch/x86/x86/pmap.c:2527
        ci = (struct cpu_info *) 0x8001003b
        pmap = (struct pmap *) 0x0
#9  0xc0283b50 in mi_switch (l=0xcbead0e0) at /usr/src/sys/kern/kern_synch.c:771
        prevlwp = (struct lwp *) 0xcb12ec80
        ci = (struct cpu_info *) 0xc05a0640
        spc = (struct schedstate_percpu *) 0xc05a0688
        newl = <value optimized out>
        retval = <value optimized out>
        oldspl = 0
        bt = {sec = 19093, frac = 17388637305122208955}
        returning = false
#10 0xc0280fd3 in sleepq_block (timo=0, catch=true) at /usr/src/sys/kern/kern_sleepq.c:262
        error = <value optimized out>
        sig = <value optimized out>
        p = <value optimized out>
        l = (lwp_t *) 0xcbead0e0
        biglocks = 0
#11 0xc03bd959 in sel_do_scan (fds=0xcc791ba0, nfds=1, ts=0x0, mask=0x0, retval=0xcc791d28, selpoll=0)
    at /usr/src/sys/kern/sys_select.c:253
        ncoll = 19
        l = (lwp_t * const) 0xcbead0e0
        p = (proc_t * const) 0xcbea194c
        sc = (selcpu_t *) 0xcb113e80
        lock = (kmutex_t *) 0xcb12dcc0
        sleepts = {tv_sec = -4471120875112812544, tv_nsec = 4096}
        error = <value optimized out>
        timo = 0

> namei() does a VREF which is atomic_inc_uint(&vp->v_usecount).
> I guess vp is NULL; extracting dmesg
> from the core dump to get the fault address could confirm it.
>
> I don't have many clues about how namei works, so I can't easily see whether
> something is wrong with vnode locking or the reference count here.
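
As an illustration of why the fault address would help confirm a NULL pointer: when a field is accessed through a NULL struct pointer, the faulting address is simply the field's offset within the struct, so it lands at or very near zero (on x86 the faulting address is what cr2 holds after a page fault). The sketch below uses a made-up layout called FakeVnode; it is not the real struct vnode, and the offset it prints means nothing for an actual kernel.

```python
import ctypes


class FakeVnode(ctypes.Structure):
    # Hypothetical layout for illustration only, NOT the real struct vnode.
    _fields_ = [
        ("v_pad", ctypes.c_uint32 * 3),  # placeholder for earlier members
        ("v_usecount", ctypes.c_uint),   # the counter an atomic increment touches
    ]


def fault_address(base_ptr: int, field) -> int:
    """Address touched when accessing `field` through a pointer `base_ptr`."""
    return base_ptr + field.offset


# With a NULL pointer the access lands at the field offset itself.
print(hex(fault_address(0, FakeVnode.v_usecount)))
```

A fault address that is zero or a small number is therefore consistent with a NULL pointer dereference, while a large, arbitrary-looking address would point to some other kind of corruption.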



-- 
HE CE3OH...

