Subject: Re: bug kern/5026
To: Frank van der Linden <frank@wins.uva.nl>
From: Greg Wohletz <greg@lonnie.egr.unlv.edu>
List: current-users
Date: 05/07/1998 12:17:20
OK, I've made some progress in tracking down what is going on with this
panic.  Here is the code segment that triggers the panic (from nfs_serv.c,
near the end of the nfsrv_rename routine):

        vrele(tond.ni_startdir);
        FREE(tond.ni_cnd.cn_pnbuf, M_NAMEI);
out1:           
        if (fdirp) {
                fdiraft_ret = VOP_GETATTR(fdirp, &fdiraft, cred, procp);
                vrele(fdirp);
        }
        if (tdirp) {
                tdiraft_ret = VOP_GETATTR(tdirp, &tdiraft, cred, procp);
                vrele(tdirp);
        }
        vrele(fromnd.ni_startdir);    <--------- this call triggers the panic
        FREE(fromnd.ni_cnd.cn_pnbuf, M_NAMEI);
        nfsm_reply(2 * NFSX_WCCDATA(v3));


Now I noted from the crash dumps that tond.ni_startdir was always equal
to fromnd.ni_startdir when the crash occurred, so I placed the following
debugging code right in front of the first vrele call:

        if (tond.ni_startdir == fromnd.ni_startdir) {
                Error_refcnt2 = fromnd.ni_startdir->v_usecount;
        } else {
                Error_refcnt2 = -999;
        }


Then, the next time the system panicked, I inspected the value of
Error_refcnt2, and sure enough it was 1.  Clearly, if the kernel gets
to the first vrele with those two pointers equal and the ref count
set to 1, a panic is inevitable, since vrele is about to be called
twice on that vnode.
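
To make the failure mode concrete, here is a small user-space sketch of
the same sequence.  It is not kernel code: toy_vnode, toy_vrele and
toy_rename_tail are hypothetical names invented for illustration, and
the model assumes only that a release drops v_usecount by one and that
releasing a vnode with no references left is the error (the sketch
returns -1 at the point where the real kernel panics).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a vnode: only the reference count is modelled. */
struct toy_vnode {
        int v_usecount;
};

/*
 * Hypothetical analogue of vrele(): drop one reference.
 * Returns 0 on success, or -1 where this model assumes the real
 * kernel would panic because there is no reference left to drop.
 */
static int
toy_vrele(struct toy_vnode *vp)
{
        assert(vp != NULL);
        if (vp->v_usecount <= 0)
                return -1;
        vp->v_usecount--;
        return 0;
}

/*
 * Mimics the two releases at the end of nfsrv_rename: first the
 * "to" start directory, then the "from" start directory.
 * Returns 0 if both releases succeed, -1 if either one fails.
 */
static int
toy_rename_tail(struct toy_vnode *to_startdir,
    struct toy_vnode *from_startdir)
{
        if (toy_vrele(to_startdir) != 0)
                return -1;
        return toy_vrele(from_startdir);
}
```

Calling toy_rename_tail(&vn, &vn) on a vnode whose v_usecount is 1
fails on the second release, which matches the state captured in
Error_refcnt2.  With a count of 2, or with two distinct vnodes holding
one reference each, both releases succeed.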

Now the question is how the kernel gets into this state.  Hopefully
what I have discovered will help someone find the cause; meanwhile
I will continue to stumble forward as best I can with my limited
understanding of the inner workings of the vnode code.


For anyone who is interested, I have placed the latest crash dump in

http://www.cs.unlv.edu/~greg/netbsd/

That directory contains several crash dumps and kernels.  Dump #9 is the
one that was generated after I inserted the debugging code.  nfs_serv.c is
a copy of that code with my debugging statements in it, so that gdb line
numbers will make sense to anyone who wants to look at them.


						--Greg