netbsd-bugs: kern/36331: MP deadlock resembling ufs

Subject: kern/36331: MP deadlock resembling ufs_hashlock deadlock
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <perseant@netbsd.org>
List: netbsd-bugs
Date: 05/14/2007 21:05:00
>Number:         36331
>Category:       kern
>Synopsis:       MP deadlock between ufs_ihashget() and VOP_LOOKUP()
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon May 14 21:05:00 +0000 2007
>Originator:     Konrad Schroder
>Release:        NetBSD 4.99.19, -current as of 2007-05-09
>Organization:
>Environment:
System: NetBSD nene.hitl.washington.edu 4.99.19 NetBSD 4.99.19 (NENE.MP) #11: Tue May  9 13:29:44 PDT 2007 perseant@wetbar.hitl.washington.edu:/alt/src-current/sys/arch/i386/compile/obj.i386/NENE.MP i386
Architecture: i386
Machine: i386
>Description:
I've been chasing down what I had thought was an LFS bug until I isolated
it last night, and it looks more like a multiprocessing locking bug to me.
Any thoughts on what is going on here would be helpful.  In the debugger I
am seeing this:

db{0}> ps
PID     PPID            PGRP    UID S  FLAGS  LWPs      COMMAND   WAIT
739     924             924       0 2   0       1       sh        tstile
660     750             750       0 2   0       1       sh        vnlock

There are other processes, of course, but these are the interesting ones.
Everything else is either sleeping in its normal sleep state, or wedged in
tstile or vnlock.  My instrumentation tells me that process 660 is holding
the ufs_hashlock, which if I look, it is:       

ddb{0}> show all proc/a
PID     COMMAND         STRUCT PROC *
660     sh              0xccbd21e0      ...
                        ^^^^^^^^^^
ddb{0}> x/x ufs_hashlock		(lock holder lwp is the first member)
netbsd:ufs_hashlock:    ccbd5502
ddb{0}> x/x ccbd5502+0xa2		(find struct proc * in struct lwp)
0xccbd55a4:     ccbd21e0
                ^^^^^^^^

The interesting thing is that 660 is waiting for the vnlock that 739 is
holding, while 739 is waiting for the ufs_hashlock that 660 seems to be
holding:

db{0}> trace/t 0t739
trace: pid 739 lid 1 at 0xcc7126dc
sleepq_switch(0,0,c03c8734,c039afe4,c03c1250) at netbsd:sleepq_switch+0x5b
mutex_vector_enter(c03c8734,2,c1f98d00,cc7127d4,0) at netbsd:mutex_vector_enter$
ffs_vget(c20c8000,131ca80,0,cc712894,0) at netbsd:ffs_vget+0x96
ufs_lookup(cc7128cc,0,c026e690,cbad4cb0) at netbsd:ufs_lookup+0x7a
VOP_LOOKUP(cf50a014,cc712b50,cc712b64,cc712b3c,s0) at netbsd:VOP_LOOKUP+0x29
lookup()   ^^^^^^^^
namei()
check_exec()
execve1()
sys_execve()
syscall_plain()

db{0}> trace/t 0t660
trace: pid 660 lid 1 at 0xcc5d60c
sleepq_switch(0,0,cf50a09c,c03551de,0) at netbsd:sleepq_switch+0x5b
ltsleep(cf50a09c,14,c03551de,0,cf50a09c) at netbsd:ltsleep+0x14a
acquire(0,600,c01cb0b6,2e3,c0234d14) at netbsd:acquire+0x104
_lockmgr(cf50a09c,10002,cf50a014,c03754d4,94c) at netbsd:_lockmgr+0x84a
ufs_lock(cc5cd780,c03cee22,3e2,c03d1780,ccbd5500) at netbsd:ufs_lock+0x3a
VOP_LOCK(cf50a014,100002,c,c03cee20,10002) at netbsd:VOP_LOCK+0x23
vn_lock(cf50a014,10002,0,c,cf50a014) at netbsd:vn_lock+0x96
vget(cf50a014,10002,94,0,c03c1250) at netbsd:vget+0x74
     ^^^^^^^^
ufs_ihashget(4,2,0,2,c03c8734) at netbsd:ufs_ihashget+0x94
ffs_vget()
ufs_root()
lookup()
namei()
check_exec()
execve1()
sys_execve()
syscall_plain()

...but if you look in the code, ufs_ihashget() drops the hashlock before
calling vget().  This looks to me like either a lost wakeup or a compiler
misoptimization.

This system is running NetBSD/i386 MP, -current as of May 9 (4.99.19, and
a kernel that is a trimmed-down version of GENERIC.MP with some local mods
in the LFS bits, and instrumented so I can easily see who's waiting for
whom for the ufs_hashlock).

>How-To-Repeat:
I was running an LFS test, similar to this one:

	#!/bin/sh

	while true
	do
		newfs_lfs /dev/rwhatever
		mount /dev/whatever /lfs
		(cd /lfs; tar zxf /somewhere/pkgsrc.tar.gz)
		(cd /lfs; tar zxf /somewhere/pkgsrc.tar.gz)
		rm -rf /lfs/*
		(cd /lfs; tar zxf /somewhere/pkgsrc.tar.gz)
		sync
		umount /lfs
	done

After a few hours this will generally wedge the machine with the same symptoms.

>Fix:
I don't have a fix, not being adept enough at reading assembly code to
know whether what's really going on is an optimization bug, a lost wakeup,
or something else entirely.