NetBSD-Bugs archive


kern/41671: fsck_ffs(1) can cause a locking error



>Number:         41671
>Category:       kern
>Synopsis:       fsck_ffs(1) can cause a locking error
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jul 06 04:15:00 +0000 2009
>Originator:     Anon Ymous
>Release:        NetBSD 5.99.15 / 2009.07.04.12.00.00
>Organization:
>Environment:
System: NetBSD t61.localnet 5.99.15 NetBSD 5.99.15 (T61) #1: Sun Jul 5 17:35:11 EDT 2009 anon%t61.localnet@localhost:/s/NetBSD/obj/sys/arch/amd64/compile/T61 amd64
Architecture: x86_64
Machine: amd64
>Description:
        At boot, "fsck -p" can trigger a kernel locking error.  On my
        system this happens only on the root file system.  Christos
        said he does not see it at all.  VFS_MOUNT() must fail for the
        offending code to be reached (see patch below).

        The following was transcribed from a screen shot:

/dev/rwd0a: MARKING FILE SYSTEM CLEAN
Mutex Error: mutex_vector_enter: locking against myself

lock address : 0xffff80004faa3950
current cpu  :                  0
current lwp  : 0xffff8000512d4800
owner field  : 0xffff8000512d4800 wait/spin:                0/0

panic: lock error
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff8022d555 cs 8 rflags 246 cr2  412d78 cpl 0 rsp ffff80005129d730
Stopped in pid 10.1 (fsck_ffs) at       netbsd:breakpoint+0x5:  leave
db{0}> bt
breakpoint() at netbsd:breakpoint+0x5
panic() at netbsd:panic+0x29a
lockdebug_abort() at netbsd:lockdebug_abort+0x3a
mutex_vector_enter() at netbsd:mutex_vector_enter+0x1fd
mountd_set_exports_list() at netbsd:mountd_set_exports_list+0xfd
nfs_export_update_30() at netbsd:nfs_export_update_30+0x39
vfs_hooks_reexport() at netbsd:vfs_hooks_reexport+0x4c
do_sys_mount() at netbsd:do_sys_mount+0x57e
sys___mount50() at netbsd:sys___mount50+0x33
syscall() at netbsd:syscall+0xaa
db{0}>
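
The "locking against myself" message means the LWP that already owns
the mutex tried to take it again (the "current lwp" and "owner field"
values above are identical).  The following is a minimal userland
sketch of the same condition, not kernel code: it assumes POSIX
threads, and the helper names init_errorcheck_mutex() and
relock_by_owner() are made up for illustration.  An error-checking
pthread mutex reports the relock as EDEADLK, where the kernel with
lock debugging panics instead.

```c
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Userland analogy only: kernel mutexes are not pthread mutexes.
 * Initialize an error-checking mutex, which detects a relock by its
 * current owner instead of deadlocking.
 */
static int
init_errorcheck_mutex(pthread_mutex_t *mtx)
{
	pthread_mutexattr_t attr;
	int error;

	error = pthread_mutexattr_init(&attr);
	if (error != 0)
		return error;
	error = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (error == 0)
		error = pthread_mutex_init(mtx, &attr);
	pthread_mutexattr_destroy(&attr);
	return error;
}

/*
 * Hypothetical helper: lock a mutex twice from the same thread and
 * return the error code of the second attempt.
 */
static int
relock_by_owner(void)
{
	pthread_mutex_t mtx;
	int error;

	error = init_errorcheck_mutex(&mtx);
	assert(error == 0);

	error = pthread_mutex_lock(&mtx);	/* first acquisition succeeds */
	assert(error == 0);

	error = pthread_mutex_lock(&mtx);	/* owner tries to lock again */

	/* The mutex is still held exactly once; release and clean up. */
	pthread_mutex_unlock(&mtx);
	pthread_mutex_destroy(&mtx);
	return error;
}
```

Calling relock_by_owner() returns EDEADLK, the userland counterpart of
the lock error shown in the backtrace.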

>How-To-Repeat:
        Shut down the system leaving the root file system dirty (e.g.,
        kill the power).  When rebooting, the lock error occurs right
        after root is marked clean.

>Fix:
        Christos sent me the following patch, which corrected the
        problem.  However, he asked me to file this PR as he did not
        think this was the best fix (e.g., perhaps it is better to
        pass a flag telling the callee not to acquire the lock).

Index: vfs_syscalls.c
===================================================================
RCS file: /cvsroot/src/sys/kern/vfs_syscalls.c,v
retrieving revision 1.396
diff -u -u -r1.396 vfs_syscalls.c
--- vfs_syscalls.c      2 Jul 2009 12:53:47 -0000       1.396
+++ vfs_syscalls.c      5 Jul 2009 21:10:44 -0000
@@ -217,7 +217,9 @@
                 * Update failed; let's try and see if it was an
                 * export request.  For compat with 3.0 and earlier.
                 */
+               mutex_exit(&mp->mnt_updating);
                error2 = vfs_hooks_reexport(mp, path, data);
+               mutex_enter(&mp->mnt_updating);
 
                /*
                 * Only update error code if the export request was
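
For comparison, here is a rough userland sketch of the two approaches
mentioned above: releasing the lock around the call (the shape of the
patch) versus passing a flag so the callee skips the acquisition.  It
is only an illustration under stated assumptions: struct fake_mount,
fake_reexport(), update_drop_lock() and update_pass_flag() are
hypothetical names, pthread mutexes stand in for kernel mutexes, and
nothing here reflects the real vfs_hooks_reexport() interface.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mount; only the lock matters here. */
struct fake_mount {
	pthread_mutex_t updating;	/* plays the role of mnt_updating */
	int exports_generation;		/* state protected by the lock */
};

/*
 * Hypothetical stand-in for the re-export hook.  If 'locked' is true,
 * the caller already holds mp->updating and the hook must not take it.
 */
static int
fake_reexport(struct fake_mount *mp, bool locked)
{
	assert(mp != NULL);

	if (!locked)
		pthread_mutex_lock(&mp->updating);
	mp->exports_generation++;
	if (!locked)
		pthread_mutex_unlock(&mp->updating);
	return 0;
}

/* Approach A (shape of the patch): drop the lock around the call. */
static int
update_drop_lock(struct fake_mount *mp)
{
	int error;

	pthread_mutex_lock(&mp->updating);
	/* ... the update attempt fails here ... */
	pthread_mutex_unlock(&mp->updating);
	error = fake_reexport(mp, false);	/* hook locks for itself */
	pthread_mutex_lock(&mp->updating);
	/* ... remainder of the update path ... */
	pthread_mutex_unlock(&mp->updating);
	return error;
}

/* Approach B: keep the lock held and tell the callee about it. */
static int
update_pass_flag(struct fake_mount *mp)
{
	int error;

	pthread_mutex_lock(&mp->updating);
	error = fake_reexport(mp, true);	/* hook skips locking */
	pthread_mutex_unlock(&mp->updating);
	return error;
}
```

In approach A there is a short window during which another thread can
take the lock between the release and the re-acquisition; approach B
has no such window but requires every caller to pass the flag
correctly.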


