Subject: kern/10448: softdep panics under heavy load at lockmgr
To: None <gnats-bugs@gnats.netbsd.org>
From: IWAMOTO Toshihiro <iwamoto@sat.t.u-tokyo.ac.jp>
List: netbsd-bugs
Date: 06/25/2000 22:42:23
>Number: 10448
>Category: kern
>Synopsis: softdep panics under heavy load at lockmgr
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Jun 25 22:43:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: IWAMOTO Toshihiro <iwamoto@sat.t.u-tokyo.ac.jp>
>Release: NetBSD-current of 2 weeks ago, also confirmed on recent 1.5_ALPHA
>Organization:
Univ. of Tokyo
>Environment:
System: NetBSD yomogi.sat.t.u-tokyo.ac.jp 1.4ZA NetBSD 1.4ZA (YOMOGI) #18: Mon Jun 12 17:50:29 JST 2000 iwamoto@yomogi.sat.t.u-tokyo.ac.jp:/usr/src/syssrc/sys/arch/i386/compile/YOMOGI i386
>Description:
	With softdep enabled and heavy filesystem activity,
	ffs_vget() can end up calling itself, resulting in the panic below.
	As the backtrace shows, the sequence of events is:
	1. ffs_vget() calls getnewvnode() with ufs_hashlock held
	2. getnewvnode() calls vgonel() to recycle a vnode
	3. during the flush, flush_pagedep_deps() needs to call ffs_vget()
	   to add the directory entries "." and ".." before the vnode
	   can be flushed
	4. the inner ffs_vget() tries to take ufs_hashlock, which is
	   already held, so lockmgr() panics (see the sketch below)
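	In userland terms the failure mode looks like this (a minimal,
	hypothetical illustration with made-up names: hashlock_held stands
	in for ufs_hashlock, and the abort() for lockmgr()'s panic):

	#include <stdio.h>
	#include <stdlib.h>

	static int hashlock_held;	/* stands in for ufs_hashlock */

	static void
	lock_hash(void)
	{
		/* lockmgr()'s self-deadlock check, reduced to a flag */
		if (hashlock_held) {
			fprintf(stderr, "lockmgr: locking against myself\n");
			abort();
		}
		hashlock_held = 1;
	}

	static void ffs_vget_sketch(int inner);

	static void
	getnewvnode_sketch(void)
	{
		/*
		 * Recycling a dirty directory vnode re-enters ffs_vget()
		 * via vgonel() -> vclean() -> VOP_FSYNC() ->
		 * softdep_sync_metadata() -> flush_pagedep_deps().
		 */
		ffs_vget_sketch(1);
	}

	static void
	ffs_vget_sketch(int inner)
	{
		lock_hash();		/* panics on the nested call */
		if (!inner)
			getnewvnode_sketch();
		hashlock_held = 0;	/* unlock */
	}

	int
	main(void)
	{
		ffs_vget_sketch(0);
		return (0);
	}

	The backtrace from the panic: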
#10 0xc015ecc9 in panic (fmt=0xc0304380 "lockmgr: locking against myself")
at ../../../../kern/subr_prf.c:219
#11 0xc0151296 in lockmgr (lkp=0xc03b67e0, flags=34, interlkp=0x0)
at ../../../../kern/kern_lock.c:518
#12 0xc023c43f in ffs_vget (mp=0xc08c3600, ino=1075358, vpp=0xd88f3a64)
at ../../../../ufs/ffs/ffs_vfsops.c:968
#13 0xc02ea224 in flush_pagedep_deps (pvp=0xd8765448, mp=0xc08c3600,
diraddhdp=0xc0a675b0) at ../../../../../gnu/sys/ufs/ffs/ffs_softdep.c:4296
#14 0xc02e9cad in softdep_sync_metadata (v=0xd88f3b50)
at ../../../../../gnu/sys/ufs/ffs/ffs_softdep.c:4026
#15 0xc023d45d in ffs_fsync (v=0xd88f3b50)
at ../../../../ufs/ffs/ffs_vnops.c:312
#16 0xc0176f16 in vinvalbuf (vp=0xd8765448, flags=1, cred=0xffffffff,
p=0xd88ec018, slpflag=0, slptimeo=0) at ../../../../sys/vnode_if.h:449
#17 0xc0177bfd in vclean (vp=0xd8765448, flags=8, p=0xd88ec018)
at ../../../../kern/vfs_subr.c:1424
#18 0xc0177ddb in vgonel (vp=0xd8765448, p=0xd88ec018)
at ../../../../kern/vfs_subr.c:1554
#19 0xc0176cbd in getnewvnode (tag=VT_UFS, mp=0xc08c3600, vops=0xc0874200,
vpp=0xd88f3c4c) at ../../../../kern/vfs_subr.c:508
#20 0xc023c45e in ffs_vget (mp=0xc08c3600, ino=1394840, vpp=0xd88f3cb0)
at ../../../../ufs/ffs/ffs_vfsops.c:971
#21 0xc0235dae in ffs_valloc (v=0xd88f3cb4)
at ../../../../ufs/ffs/ffs_alloc.c:605
#22 0xc024c700 in ufs_makeinode (mode=33188, dvp=0xd8793014, vpp=0xd88f3ee8,
cnp=0xd88f3efc) at ../../../../sys/vnode_if.h:933
#23 0xc0249a82 in ufs_create (v=0xd88f3e08)
at ../../../../ufs/ufs/ufs_vnops.c:117
#24 0xc017df98 in vn_open (ndp=0xd88f3ed8, fmode=1550, cmode=420)
at ../../../../sys/vnode_if.h:96
#25 0xc017a3ae in sys_open (p=0xd88ec018, v=0xd88f3f88, retval=0xd88f3f80)
at ../../../../kern/vfs_syscalls.c:1004
#26 0xc02699e7 in syscall (frame={tf_es = 134873119, tf_ds = -1078001633,
tf_edi = 0, tf_esi = 134882304, tf_ebp = -1077946548, tf_ebx = 1549,
tf_edx = -1, tf_ecx = -35, tf_eax = 5, tf_trapno = 3, tf_err = 2,
tf_eip = 134730351, tf_cs = 23, tf_eflags = 514, tf_esp = -1077946912,
tf_ss = 31, tf_vm86_es = 0, tf_vm86_ds = 0, tf_vm86_fs = 0,
tf_vm86_gs = 0}) at ../../../../arch/i386/i386/trap.c:766
#27 0xc0100dc1 in syscall1 ()
can not access 0xbfbfd74c, invalid translation (invalid PDE)
>How-To-Repeat:
	To reproduce the above panic, an incomplete directory vnode
	must be vgonel()'ed from getnewvnode(). That situation is easy
	to bring about:
	1. Boot with a large amount (~50MB) of buffers and the default
	   value of numvnodes (778); one way to configure the buffers is
	   sketched after the commands below.
	2. Generate heavy disk activity by issuing the following commands
	   simultaneously.
$ cd /usr/pkgsrc/www/mozilla; make (tarball extraction)
# cd /usr/src/syssrc; cvs update -Pd
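	For step 1, the buffer sizing can be set at kernel build time;
	a hypothetical config fragment, assuming 4KB pages and the stock
	BUFPAGES option (the exact value is an assumption):

	options 	BUFPAGES=12800	# 12800 * 4KB pages = ~50MB of buffers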
>Fix:
A workaround is to configure a large number of vnodes.
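	For example (a hypothetical invocation; this assumes the running
	kernel lets kern.maxvnodes be raised via sysctl, and the value
	is arbitrary):

	# sysctl -w kern.maxvnodes=8192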
	I think holding ufs_hashlock across the getnewvnode() call is
	unnecessary, and the patch below seems to work.
	Am I missing the point?
--- /sys/ufs/ffs/ffs_vfsops.c	Sun Jun 25 16:57:51 2000
+++ ./ffs_vfsops.c	Mon Jun 26 13:47:58 2000
@@ -1014,15 +1014,16 @@
 	} while (lockmgr(&ufs_hashlock, LK_EXCLUSIVE|LK_SLEEPFAIL, 0));
 
 	/* Allocate a new vnode/inode. */
+	lockmgr(&ufs_hashlock, LK_RELEASE, 0);
 	if ((error = getnewvnode(VT_UFS, mp, ffs_vnodeop_p, &vp)) != 0) {
 		*vpp = NULL;
-		lockmgr(&ufs_hashlock, LK_RELEASE, 0);
 		return (error);
 	}
 	/*
 	 * XXX MFS ends up here, too, to allocate an inode. Should we
 	 * XXX create another pool for MFS inodes?
 	 */
+	lockmgr(&ufs_hashlock, LK_EXCLUSIVE|LK_SLEEPFAIL, 0);
 	ip = pool_get(&ffs_inode_pool, PR_WAITOK);
 	memset((caddr_t)ip, 0, sizeof(struct inode));
 	vp->v_data = ip;
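	One thing worth double-checking with this approach: while
	ufs_hashlock is dropped, another process could enter a vnode for
	the same inode in the hash. A possible guard on reacquisition
	(an illustrative sketch only, not part of the patch above; it
	assumes the two-argument ufs_ihashget() called earlier in
	ffs_vget() and an ungetnewvnode()-style helper for returning the
	unused vnode):

	/*
	 * Hypothetical re-check after retaking ufs_hashlock: another
	 * process may have created the same inode while the lock was
	 * dropped around getnewvnode().
	 */
	lockmgr(&ufs_hashlock, LK_EXCLUSIVE, 0);
	if ((*vpp = ufs_ihashget(dev, ino)) != NULL) {
		/* Lost the race; discard the freshly allocated vnode. */
		lockmgr(&ufs_hashlock, LK_RELEASE, 0);
		ungetnewvnode(vp);	/* assumed helper */
		return (0);
	}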
>Release-Note:
>Audit-Trail:
>Unformatted: