Subject: kern/7954: nullfs panic when accessing front and back layers simultaneously
To: gnats-bugs@gnats.netbsd.org
From: apb@iafrica.com
List: netbsd-bugs
Date: 07/10/1999 04:56:55
>Number: 7954
>Category: kern
>Synopsis: nullfs panic when accessing front and back layers simultaneously
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people (Kernel Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Jul 10 04:35:01 1999
>Last-Modified:
>Originator: Alan Barrett
>Organization:
not much
>Release: NetBSD-current 1999-07-09
>Environment:
System: NetBSD apb.iafrica.com 1.4E NetBSD 1.4E (APB) #0: Fri Jul 9 17:13:53 SAST 1999 apb@apb.iafrica.com:/b/USR/src/sys/arch/i386/compile/APB i386
>Description:
Despite Bill Studenmund's recent report that "nullfs now works", it
still doesn't seem to handle simultaneous access to the front and back
layers.
My /usr filesystem is actually a nullfs mount of /b/USR, where /b
is an ordinary ffs mount from a disk. I can quite easily trigger a
"lockmgr: locking against myself" panic by running "ls -lR /usr" and
"ls -lR /b/USR" at the same time.
Here's some information about the panic (copied by hand, so
there might be transcription errors):
panic: lockmgr: locking against myself
Stopped in ls at Debugger+0x4: leave
db> t
Debugger(f9258e10,0,f92d8c8c,f92d7c04,f012b922) at Debugger+0x4
panic(f0237660,10002,f044b800,0,0) at panic+0x55
lockmgr(f9258e10,10002,f921345c,f92133d0,f044b800) at lockmgr+0x2ee
layer_lock(f92d7c64) at layer_lock+0x4c
vclean(f92133d0,8,f92d8c8c) at vclean+0x55
vgonel(f92133d0,f92d8c8c) at vgonel+0x3b
getnewvnode(1,f044b800,f0442e00,f92d7d00,f9258d80) at getnewvnode+0x11d
ffs_vget(f044b800,1403de,f92d7d98,f92d7ea0,3) at ffs_vget+0x68
ufs_lookup(f92d7e00,f9258d80,f92d7eb4,f92d7e90,f015533f) at ufs_lookup+0xd3e
lookup(f92d7e90,f92d7f88,f92d8c8c,f92d7f88,f9144288) at lookup+0x24c
namei(f92d7e90,f92d7f88,f92d8c8c,f92d7f80,80b3840) at namei+0x313
sys___lstat13(f92d8c8c,f92d7f88,f92d7f80,0,80a2984) at sys___lstat13+0x44
syscall() at syscall+0x23a
--- syscall (number 280) ---
0x8061805:
db> ps/w
PID COMMAND EMUL PRI UTIME STIME WAIT-MSG WAIT-CHANNEL
>15872 ls netbsd 73 0.5 0.6
15871 ls netbsd 26 0.5 0.8 ttyout 0xf91103d8
[... more processes not shown ...]
db> ps
PID PPID PGRP UID S FLAGS COMMAND WAIT
>15872 15463 15872 0 2 0x4006 ls
15871 15463 15871 0 3 0x4086 ls ttyout
[... more processes not shown ...]
>How-To-Repeat:
: let /b be the mountpoint of an ordinary FFS filesystem.
: let /usr be an empty directory on the root filesystem.
: let /b/USR be a directory tree that contains everything \
that one would normally expect to live in /usr.
mount -t null /b/USR /usr
ls -lR /usr & ls -lR /b/USR
: wait for it to panic
>Fix:
Fix the locking protocol or implementation?
>Audit-Trail:
>Unformatted: