Subject: More data. Re: kernel panic in nfs_reclaim (kern/17107)
To: Christos Zoulas <christos@zoulas.com>
From: Artem Belevich <art@riverstonenet.com>
List: tech-kern
Date: 10/02/2002 11:23:40
I got the panic again tonight, and I still have the machine in DDB.  I
think I can keep it this way for a couple more hours, so if somebody would
like more info from DDB, I'd be happy to type commands for you.

Here's the stack trace, this time from a 1.6 GENERIC_DIAGNOSTIC kernel.

nfs_reclaim(e6200c54,8,0,c02a6953,e47dcc9c) at nfs_reclaim+0x54
VOP_RECLAIM(e4cd70f4,e3c42740,200000,0) at VOP_RECLAIM+0x2e
vclean(e4cd70f4,8,e3c42740,c025eb3c) at vclean+0x107
vgonel(e4cd70f4,e3c42740,0,c026034e) at vgonel+0x46
getnewvnode(1,c10a4200,c0f7ef00,e6200d4c,0) at getnewvnode+0x210
ffs_vget(c10a4200,56b198,e6200dd8,e3c42740,e58bfcb4) at ffs_vget+0x4f
ufs_lookup(e6200e10,30002,e6200e20,c02b14f9,e6200ef8) at ufs_lookup+0x74a
VOP_LOOKUP(e58bfcb4,e6200f08,e6200f1c,c02aac3a,e58bfcb4) at VOP_LOOKUP+0x35
lookup(e6200ef8,e758d000,400,e6200f10,e6200f80) at lookup+0x2a4
namei(e6200ef8,e57fd77c,e6200f1c,2) at namei+0x2f1
sys_unlink(e3c42740,e6200f80,e6200f78,c0375e0f) at sys_unlink+0x3f
syscall_plain(1f,1f,1f,1f,0) at syscall_plain+0xa7


I've checked the vnode and its v_data and v_mount pointers:

db> show vnode e4cd70f4
OBJECT 0xe4cd70f4: locked=0, pgops=0xc0663f64, npages=0, refs=0

VNODE flags 100<XLOCK>
mp 0xc1882200 numoutput 0 size 0xffffffffffffffff
data 0xe6e3fb98 usecount 0 writecount 0 holdcnt 0 numoutput 0
type VNON(0) tag VT_NFS(2) id 0xc3c7ed mount 0xc1882200 typedata 0x0

db> show object 0xe4cd70f4
OBJECT 0xe4cd70f4: locked=0, pgops=0xc0663f64, npages=0, refs=0

db> x 0xc0663f64
uvm_vnodeops:   0

v->v_data (the nfsnode) seems to be OK. At least it points back to the vnode:
v->v_data->n_vnode == 0xe4cd70f4

db> x/m 0xe6e3fb98,40
0xe6e3fb98:     50dd65c0 00000000 00000000 00000000     P.e.............
0xe6e3fba8:     00000000 00000000 552e50c0 ffffffff     ........U.P.....
0xe6e3fbb8:     08000000 00000000 00000000 00000000     ................
0xe6e3fbc8:     00000000 00000000 00000000 00000000     ................
0xe6e3fbd8:     00000000 00000000 00000000 00000000     ................
0xe6e3fbe8:     00000000 00000000 e0e928c1 00000000     ..........(.....
0xe6e3fbf8:     00000000 00000000 00000000 3cfce3e6     ............<...
0xe6e3fc08:     804d9be6 f470cde4 00000000 00000000     .M...p..........
0xe6e3fc18:     00000000 00000000 00000000 00000000     ................
0xe6e3fc28:     00000000 00000000 00000000 00000000     ................
0xe6e3fc38:     20000000 346e3700 321d8700 20000000      ...4n7.2... ...
0xe6e3fc48:     00376e34 321d8700 3b7c0000 21411a00     .7n42...;|..!A..
0xe6e3fc58:     6f4f0400 00000000 00000000 00000000     oO..............
0xe6e3fc68:     00000000 00000000 00000000 00000000     ................
0xe6e3fc78:     00000000 ffffffff 00000000 00000000     ................
0xe6e3fc88:     00000000 00000000 00000000 00000000     ................
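
The back-pointer check can be repeated mechanically on a saved dump. What
follows is a minimal sketch, assuming that `x/m` prints each 32-bit word in
memory byte order on little-endian i386, so that the pointer 0xe4cd70f4
appears as the text `f470cde4`. The helper name `find_pointer` is
hypothetical, and only two lines of the dump are reproduced here.

```python
import struct


def find_pointer(dump_lines, pointer):
    """Return the addresses at which a 32-bit little-endian pointer
    appears in an `x/m`-style dump (four hex words per line)."""
    # Byte-order assumption: the dump shows bytes as they sit in memory.
    needle = struct.pack("<I", pointer).hex()
    hits = []
    for line in dump_lines:
        addr_text, rest = line.split(":", 1)
        base = int(addr_text, 16)
        # Only the first four fields are hex words; the rest is the
        # ASCII column.
        for i, word in enumerate(rest.split()[:4]):
            if word == needle:
                hits.append(base + 4 * i)
    return hits


DUMP = [
    "0xe6e3fbf8:     00000000 00000000 00000000 3cfce3e6     ............<...",
    "0xe6e3fc08:     804d9be6 f470cde4 00000000 00000000     .M...p..........",
]

hits = find_pointer(DUMP, 0xE4CD70F4)
print([hex(h) for h in hits])        # ['0xe6e3fc0c']
print(hex(hits[0] - 0xE6E3FB98))     # 0x74
```

The single hit sits 0x74 bytes into the nfsnode, which agrees with the
observation that n_vnode points back to the vnode. The sketch does not
derive the offset from the structure definition.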

Here is the v->v_mount pointer, and the data doesn't look good to me.
The memory where the mount point lived has been freed, and the free block
is tagged with type M_UVMAMAP (0x52 == 82).

db> x/m 0xc1882200,40
0xc1882200:     efbeadde 5200adde 00c688c1 efbeadde     ....R...........
0xc1882210:     efbeadde efbeadde efbeadde efbeadde     ................
0xc1882220:     08000000 09000000 0a000000 0b000000     ................
0xc1882230:     0c000000 0d000000 0e000000 0f000000     ................
0xc1882240:     10000000 11000000 12000000 13000000     ................
0xc1882250:     14000000 15000000 16000000 17000000     ................
0xc1882260:     18000000 19000000 1a000000 1b000000     ................
0xc1882270:     1c000000 1d000000 1e000000 1f000000     ................
0xc1882280:     20000000 21000000 22000000 23000000      ...!..."...#...
0xc1882290:     24000000 25000000 26000000 27000000     $...%...&...'...
0xc18822a0:     28000000 29000000 2a000000 2b000000     (...)...*...+...
0xc18822b0:     2c000000 2d000000 2e000000 2f000000     ,...-......./...
0xc18822c0:     30000000 31000000 32000000 33000000     0...1...2...3...
0xc18822d0:     34000000 35000000 36000000 37000000     4...5...6...7...
0xc18822e0:     38000000 39000000 3a000000 3b000000     8...9...:...;...
0xc18822f0:     3c000000 3d000000 3e000000 3f000000     <...=...>...?...
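
The first line of this dump can be decoded by hand or with a few lines of
code. This is a minimal sketch, assuming little-endian i386 and assuming
that a freed block starts with a 32-bit filler, a 16-bit type, a 16-bit
filler and a 32-bit next pointer. That layout is inferred from the dump
itself, not checked against the allocator source, and the field names and
the helper `decode_free_header` are hypothetical.

```python
import struct

# Fill pattern visible in the dump ("efbeadde" in memory byte order).
WEIRD_ADDR = 0xDEADBEEF


def decode_free_header(hex_words):
    """Decode the first 12 bytes of a freed block from `x/m` hex words.

    Assumed layout: uint32 filler, uint16 type, uint16 filler,
    uint32 next pointer, all little-endian.
    """
    raw = bytes.fromhex("".join(hex_words))
    spare0, mtype, spare1, next_ptr = struct.unpack("<IHHI", raw[:12])
    return {
        "spare0": spare0,
        "type": mtype,
        "spare1": spare1,
        "next": next_ptr,
    }


# First line of the dump at 0xc1882200.
header = decode_free_header(["efbeadde", "5200adde", "00c688c1", "efbeadde"])

print(hex(header["spare0"]))   # 0xdeadbeef
print(header["type"])          # 82
print(hex(header["next"]))     # 0xc188c600
```

Under these assumptions the type field comes out as 0x52 (82), the value
quoted above for M_UVMAMAP, and the words around it still carry the
0xdeadbeef fill.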

--Artem

On Mon, Sep 30, 2002 at 07:59:10PM -0400, Christos Zoulas <christos@zoulas.com> wrote:
> On Sep 30,  3:55pm, art@riverstonenet.com (Artem Belevich) wrote:
> -- Subject: Re: kernel panic in nfs_reclaim (kern/17107)
> 
> Is the rest of the vnode valid?
> 
> christos
> 
> | This was the first thing I tried. The kernel survived for a bit longer
> | - something like 3-4 days instead of the usual nightly panic attack - but
> | finally it crashed in the same place with nmp=0xc. This suggests
> | that the vnode's vp->v_mount has already been reused for something else.
> | 
> | This crash confuses me a little - if the filesystem is unmounted,
> | shouldn't all vnodes associated with it be gone? If so, then how come
> | this particular rogue vnode was still around?
> | 
> | --Artem
> | 
> | 
> -- End of excerpt from Artem Belevich
> 
>