Subject: kern/16942: Bug in softdep
To: None <gnats-bugs@gnats.netbsd.org>
From: None <manu@netbsd.org>
List: netbsd-bugs
Date: 05/21/2002 01:32:36
>Number:         16942
>Category:       kern
>Synopsis:       Bug in softdep
>Confidential:   no
>Severity:       critical
>Priority:       low
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue May 21 01:33:00 PDT 2002
>Closed-Date:
>Last-Modified:
>Originator:     Emmanuel Dreyfus
>Release:        NetBSD-1.5.2/i386
>Organization:
The NetBSD Project
>Environment:
NetBSD mayday 1.5.3_ALPHA NetBSD 1.5.3_ALPHA (MAYDAY) #4: Mon Jan 14 12:48:39 CET 2002     manu@melancolie:/usr/src/sys/arch/i386/compile/MAYDAY i386
>Description:
When softdep is turned on, with some disk activity (about 45 clients mounting their home directories through Samba), the machine panics. From a dozen crash dumps, it seems that it is always the same users who trigger the crash when logging in.

Here is the panic string and the backtrace:

panic: worklist_remove: not on list
#1  0xc0273e23 in cpu_reboot ()
#2  0xc01ac085 in panic ()
#3  0xc024e69a in acquire_lock ()
#4  0xc02533ee in softdep_fsync_mountdev ()
#5  0xc0257556 in ffs_full_fsync ()
#6  0xc02572d6 in ffs_fsync ()
#7  0xc0256334 in ffs_sync ()
#8  0xc01c8390 in sys_sync ()
#9  0xc01c7350 in vfs_shutdown ()
#10 0xc0273dfb in cpu_reboot ()
#11 0xc01ac085 in panic ()
#12 0xc024e90f in worklist_remove ()
#13 0xc0250876 in check_inode_unwritten ()
#14 0xc02507d7 in softdep_freefile ()
#15 0xc024bd3e in ffs_vfree ()
#16 0xc025ea04 in ufs_makeinode ()
#17 0xc025b566 in ufs_create ()
#18 0xc01cc97c in vn_open ()
#19 0xc01c8b52 in sys_open ()
#20 0xc027acf4 in syscall ()
#21 0xc0100ce7 in syscall1 ()
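
For readers unfamiliar with this kind of check, here is a minimal sketch of the invariant the panic string appears to refer to: an item carries an "on list" flag, and removing an item whose flag is not set is treated as an error. This is not the NetBSD softdep code. The names work_item, work_list, work_insert, work_remove and the flag value are hypothetical, and the sketch returns false where a kernel would panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ONWORKLIST 0x8000	/* hypothetical flag value, for illustration only */

struct work_item {
	int state;			/* holds ONWORKLIST while linked */
	struct work_item *next;		/* next item on the list */
	struct work_item **prevp;	/* address of the pointer that points at us */
};

struct work_list {
	struct work_item *head;
};

/* Link an item at the head of the list and mark it as being on a list. */
static void
work_insert(struct work_list *l, struct work_item *it)
{
	assert((it->state & ONWORKLIST) == 0);	/* must not already be linked */
	it->next = l->head;
	if (l->head != NULL)
		l->head->prevp = &it->next;
	l->head = it;
	it->prevp = &l->head;
	it->state |= ONWORKLIST;
}

/*
 * Unlink an item. Returns false if the item is not marked as being on a
 * list, which is the condition a kernel would report as "not on list".
 */
static bool
work_remove(struct work_item *it)
{
	if ((it->state & ONWORKLIST) == 0)
		return false;
	if (it->next != NULL)
		it->next->prevp = it->prevp;
	*it->prevp = it->next;
	it->state &= ~ONWORKLIST;
	return true;
}
```

In this model, removing the same item twice makes the second work_remove() call fail. That is one way such a check can trip, and it is offered only as an illustration, not as a diagnosis of this crash.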


>How-To-Repeat:
Unfortunately, this machine is in production, so I cannot crash it
deliberately until I find out exactly what is causing the problem.
>Fix:
Turning off softdep is a workaround for the problem.
>Release-Note:
>Audit-Trail:
>Unformatted: