Subject: Re: kern/29670: lockmgr: locking against myself
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org>
From: Ken Raeburn <raeburn@raeburn.org>
List: netbsd-bugs
Date: 03/21/2005 18:38:01
The following reply was made to PR kern/29670; it has been noted by GNATS.

From: Ken Raeburn <raeburn@raeburn.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/29670: lockmgr: locking against myself
Date: Mon, 21 Mar 2005 13:37:27 -0500

 Another crash (with 2.0 + vfs_lookup.c:1.57) in pretty much the same
 place yesterday morning; here's the stack trace, pulled out of the
 message buffer:
 
 tlp2: transmit underrun; new threshold: 96/256 bytes
 <3>vnode: table is full - increase kern.maxvnodes or NVNODE
 panic: lockmgr: locking against myself
 Begin traceback...
 _lockmgr(d2eb1120,10002,d2eb1098,c071e6c0,144) at netbsd:_lockmgr+0xeb
 genfs_lock(d220bbf4,0,d220bc2c,1,c05b3600) at netbsd:genfs_lock+0x25
 VOP_LOCK(d2eb1098,10002,0,10042,10002) at netbsd:VOP_LOCK+0x28
 vn_lock(d2eb1098,10002,c06c6dba,d200ab3c,d200ab3c) at netbsd:vn_lock+0x8d
 vget(d2eb1098,10002,f0,ce368417,cda76ca0) at netbsd:vget+0x9c
 cache_lookup(ce9d0b5c,d220be84,d220be98,0,154) at netbsd:cache_lookup+0x2de
 ufs_lookup(d220bd94,ce9d0b5c,d220be84,d220be98,c05b3600) at netbsd:ufs_lookup+0xc1
 layer_lookup(d220bd94,d220be98,d220bdac,c039952e,c05b2ec0) at netbsd:layer_lookup+0x57
 VOP_LOOKUP(ced919dc,d220be84,d220be98,d220be84,ce368400) at netbsd:VOP_LOOKUP+0x2e
 lookup(d220be74,ce368400,400,d220be8c,c07de240) at netbsd:lookup+0x201
 namei(d220be74,b2280000,e38,0,81094a4) at netbsd:namei+0x138
 sys___stat13(d0e0c08c,d220bf64,d220bf5c,0,c040d463) at netbsd:sys___stat13+0x58
 syscall_plain() at netbsd:syscall_plain+0x7e
 --- syscall (number 278) ---
 0x481a5cc7:
 End traceback...
 syncing disks...
 dumping to dev 0,1 offset 1164887
 dump 511 510 509 [...]
 
 The nameidata object passed to namei() is:
 
 $6 = {ni_dirp = 0x81094a4 <Address 0x81094a4 out of bounds>,
   ni_segflg = UIO_USERSPACE, ni_startdir = 0x0, ni_rootdir = 0xcbaba1c8,
   ni_vp = 0x0, ni_dvp = 0xced919dc, ni_pathlen = 76,
   ni_next = 0xce368418 "work-20050319.0521/krb5-current/src/kadmin/testing/krb5-test-root/krb5.conf",
   ni_loopcnt = 0, ni_cnd = {cn_nameiop = 0,
     cn_flags = 540740, cn_proc = 0xcda76ca0, cn_cred = 0xc189dd80,
     cn_pnbuf = 0xce368400 "/u1/k5build/autobuilder/work-20050319.0521/krb5-current/src/kadmin/testing/krb5-test-root/krb5.conf",
     cn_nameptr = 0xce36840c "autobuilder/work-20050319.0521/krb5-current/src/kadmin/testing/krb5-test-root/krb5.conf",
     cn_namelen = 11, cn_hash = 375030947, cn_consume = 0}}
 
 (where /u1 is a 'null' mount from another local, synchronous ffs file system)
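
 In userland terms, syscall 278 is __stat13, i.e. a plain stat(2) of
 the path held in cn_pnbuf; the triggering operation amounts to
 something like the following (illustrative only, not a guaranteed
 reproducer):

 #include <sys/stat.h>
 #include <stdio.h>

 /*
  * Userland side of the trace: sys___stat13 (syscall 278) is the
  * kernel entry point for stat(2), and the path is cn_pnbuf from the
  * nameidata dump above.
  */
 int
 main(void)
 {
         struct stat st;

         if (stat("/u1/k5build/autobuilder/work-20050319.0521/krb5-current"
             "/src/kadmin/testing/krb5-test-root/krb5.conf", &st) == -1)
                 perror("stat");
         return 0;
 }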
 
 The lock passed to _lockmgr() is:
 
 $10 = {lk_interlock = {lock_data = 1,
     lock_file = 0xc06c4814 "../../../../kern/kern_lock.c",
     unlock_file = 0xc06c4814 "../../../../kern/kern_lock.c",
     lock_line = 512, unlock_line = 858,
     list = {tqe_next = 0xc07e7aa4, tqe_prev = 0xc07625c0},
     lock_holder = 0}, lk_flags = 1024, lk_sharecount = 0,
   lk_exclusivecount = 1, lk_recurselevel = 0, lk_waitcount = 0,
   lk_wmesg = 0xc06c701b "vnlock", lk_un = {lk_un_sleep = {
       lk_sleep_lockholder = 21731, lk_sleep_locklwp = 1,
       lk_sleep_prio = 20, lk_sleep_timo = 0},
     lk_un_spin = {lk_spin_cpu = 21731, lk_spin_list = {
         tqe_next = 0x1, tqe_prev = 0x14}}},
   lk_lock_file = 0xc071e6c0 "../../../../miscfs/genfs/genfs_vnops.c",
   lk_unlock_file = 0xc071e6c0 "../../../../miscfs/genfs/genfs_vnops.c",
   lk_lock_line = 324, lk_unlock_line = 340}
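
 For context on why this dump trips the panic: the lock is already
 held exclusively (lk_exclusivecount = 1) by process 21731
 (lk_sleep_lockholder), presumably the same process asking for it
 again, recursion was not requested, and lk_recurselevel is 0.  A toy
 model of the check, with illustrative names rather than the real
 kern_lock.c identifiers:

 #include <stdio.h>

 /*
  * Toy model of the recursion check in _lockmgr() that produces
  * "locking against myself".  All names here are illustrative; the
  * real logic lives in kern/kern_lock.c.
  */
 struct toylock {
         int holder;             /* cf. lk_sleep_lockholder */
         int exclusivecount;     /* cf. lk_exclusivecount */
         int recurselevel;       /* cf. lk_recurselevel */
 };

 static void
 toy_lock_exclusive(struct toylock *lk, int pid, int canrecurse)
 {
         if (lk->exclusivecount > 0 && lk->holder == pid) {
                 /* Re-entry by the current holder. */
                 if (!canrecurse && lk->recurselevel == 0) {
                         printf("panic: lockmgr: locking against myself\n");
                         return;
                 }
                 lk->exclusivecount++;   /* permitted recursion */
                 return;
         }
         lk->holder = pid;               /* normal acquisition */
         lk->exclusivecount = 1;
 }

 int
 main(void)
 {
         /* Values from the dump: holder 21731, count 1, level 0. */
         struct toylock lk = { 21731, 1, 0 };

         toy_lock_exclusive(&lk, 21731, 0);      /* no LK_CANRECURSE */
         return 0;
 }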
 
 This looks much like the previous crash, though my machine survived
 this morning's cron run, so maybe the problem isn't as frequent as it
 first appeared.
 
 
 As for the "vnode: table is full" problem, in getnewvnode, it looks  
 like the one entry in the vnode_free_list has VXLOCK set, so  
 getcleanvnode won't return it.  numvnodes (28891) is a bit higher than  
 desiredvnodes (23526), and there is an entry in the free list, so I  
 suspect tryalloc is being set to 0.  (vnode_hold_list also appears to  
 have one entry only.)  Maybe tryalloc should be set to 1 if there are  
 fewer than NCPUS items on the free list, in light of the comment  
 indicating that that many might normally be locked?  (This system has a  
 single P4 processor.)
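
 A toy model of that suggestion (names paraphrased; the real decision
 is made in getnewvnode in kern/vfs_subr.c, and the baseline rule here
 is from memory rather than an exact excerpt):

 #include <stdio.h>

 #define NCPUS 1         /* this machine: a single P4 */

 /*
  * Toy model of the tryalloc decision in getnewvnode() with the tweak
  * suggested above.  Identifiers and the baseline rule are
  * paraphrased, not copied from kern/vfs_subr.c.
  */
 static int
 try_alloc(long numvnodes, long desiredvnodes, int freelistlen)
 {
         if (numvnodes < desiredvnodes)
                 return 1;       /* under the target: allocate */
         if (freelistlen < NCPUS)
                 return 1;       /* suggested: the remaining free
                                  * vnodes may all be locked (VXLOCK) */
         return 0;               /* otherwise recycle off the free list */
 }

 int
 main(void)
 {
         int len;

         /* Numbers from this crash: 28891 vnodes, 23526 desired. */
         for (len = 0; len <= 2; len++)
                 printf("free list length %d -> tryalloc %d\n",
                     len, try_alloc(28891, 23526, len));
         return 0;
 }

 With NCPUS = 1 as here, the new cutoff would only fire on an empty
 free list; a multiprocessor would get proportionally more headroom.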