Subject: kern/36608: LFS related panic with LOCKDEBUG
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <sverre@viewmark.com>
List: netbsd-bugs
Date: 07/05/2007 02:30:00
>Number: 36608
>Category: kern
>Synopsis: Panic in LFS with LOCKDEBUG defined (since mid April)
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Jul 05 02:30:00 +0000 2007
>Originator: Sverre Froyen
>Release: NetBSD 4.99.22 (2007-07-04)
>Organization:
Viewmark
>Environment:
System: NetBSD abbor.fesk.com 4.99.22 NetBSD 4.99.22 (GENERIC_LAPTOP) #1: Wed Jul 4 09:10:15 MDT 2007 toor@abbor.fesk.com:/usr/src/sys/arch/i386/compile/GENERIC_LAPTOP i386 (GENERIC_LAPTOP has LOCKDEBUG defined)
Architecture: i386
Machine: i386
>Description:
I get reproducible LFS-related panics when running the command
bogofilter -n < mailmessage
where mailmessage is a single plain email message.
In order for the panics to occur, LOCKDEBUG has to be defined and the
bogofilter database (.bogofilter/wordlist.db) has to be in a certain state.
The result is (copied from the screen):
switching with held simple_lock 0xcd155b2c CPU 0 ../../../../ufs/lfs/lfs_vnops.c:1742
0xccf9bd80:
Stopped in pid 679.1 (bogofilter) at netbsd:cpu_Debugger+0x4: popl %ebp
db> bt
cpu_Debugger(0,c07259a6,ccf7c9ac,c03bbf4a,c0815aa8) at netbsd:cpu_Debugger+0x4
simple_lock_switchcheck(c0815aa8,1,ccf7c9bc,c03cc255,c081f400) at netbsd:simple_lock_switchcheck+0x1b
mi_switch(ccf9bd80,11,ccf7c9dc,c03aed6f,c07b276a) at netbsd:mi_switch+0x2a
sleepq_block(0,0,b1,c0723def,c0795570) at netbsd:sleepq_block+0x10a
ltsleep(ccb30fa4,11,c0723def,0,c16e9754) at netbsd:ltsleep+0x151
lfs_segunlock(c16e9000,0,8f2,7fffffff,45c000) at netbsd:lfs_segunlock+0x224
lfs_putpages(ccf7cb30,ccefea84,1,c064cd40,cd155b2c) at netbsd:lfs_putpages+0xcb1
VOP_PUTPAGES(cd155b2c,0,0,0,0) at netbsd:VOP_PUTPAGES+0x40
lfs_fsync(ccf7cbb8,10002,ccf7cbdc,c040b48f,cd155b2c) at netbsd:lfs_fsync+0x13f
VOP_FSYNC(cd155b2c,cd00faa8,3,0,0) at netbsd:VOP_FSYNC+0x49
sys_fdatasync(ccf9bd80,ccf7cc48,ccf7cc68,640,ccf9bd80) at netbsd:sys_fdatasync+0x95
syscall_plain at netbsd:syscall_plain+0x116
--- syscall (number 241) ---
0xbb88c16b:
db> show lock ufs_hashlock
lock address : 0x00000000c0811554 type : sleep/adaptive
shared holds : 0 exclusive: 0
shares wanted: 0 exclusive: 0
current cpu : 0 last held: 0
current lwp : 0x00000000ccf9bd80 last held: 000000000000000000
last locked : 0x00000000c034005a unlocked : 0x00000000c03400fe
owner field : 000000000000000000 wait/spin: 0/0
Turnstile chain at 0xc0817040.
=> No active turnstile for this lock.
db> show lock ufs_ihash_lock
lock address : 0x00000000c081154c type : sleep/adaptive
shared holds : 0 exclusive: 0
shares wanted: 0 exclusive: 0
current cpu : 0 last held: 0
current lwp : 0x00000000ccf9bd80 last held: 000000000000000000
last locked : 0x00000000c034ada7 unlocked : 0x00000000c03400f2
owner field : 000000000000000000 wait/spin: 0/0
Turnstile chain at 0xc0817020.
=> No active turnstile for this lock.
db> ps/l
PID LID S FLAGS STRUCT LWP * UAREA * WAIT
>679 > 1 3 0x20000004 0xccf9bd80 0xccf7cce0 seg_iocount
...
It looks like the problem was introduced on 2007-04-17 or 2007-04-18. Kernels
up to and including 2007-04-16 do not panic. Kernels from 2007-04-19 onward
panic consistently.
See
http://mail-index.netbsd.org/current-users/2007/05/21/0028.html
http://mail-index.netbsd.org/current-users/2007/05/25/0004.html
for more information.
>How-To-Repeat:
Use the known bad bogofilter DB:
cp .bogofilter/wordlist.db.bad .bogofilter/wordlist.db
Reboot with an affected kernel and type:
bogofilter -n < mailmessage
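The steps above can be wrapped in a small script, given here only as a sketch. The variable names BOGODIR and MSG and the function names are illustrative and not part of the report. The default of $HOME/.bogofilter is an assumption based on the relative path used above. Calling trigger on an affected LOCKDEBUG kernel is expected to panic the machine.

```sh
#!/bin/sh
# Sketch of the reproduction steps from this report.
# BOGODIR, MSG, restore_bad_db and trigger are hypothetical names.
BOGODIR="${BOGODIR:-$HOME/.bogofilter}"   # assumed database directory
MSG="${MSG:-mailmessage}"                 # a single plain email message

# Step 1: put the known bad database in place of the live one.
restore_bad_db() {
    if [ ! -f "$BOGODIR/wordlist.db.bad" ]; then
        echo "missing $BOGODIR/wordlist.db.bad" >&2
        return 1
    fi
    cp "$BOGODIR/wordlist.db.bad" "$BOGODIR/wordlist.db"
}

# Step 2 (after rebooting into an affected kernel): run the command
# from the report. On an affected kernel this is where the panic occurs.
trigger() {
    bogofilter -n < "$MSG"
}
```

Run restore_bad_db first, reboot with an affected kernel, then source the script again and call trigger.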
>Fix:
unknown