Subject: ltsleep() calls in lfs are problematic
To: None <tech-kern@netbsd.org>
From: Blair Sadewitz <blair.sadewitz@gmail.com>
List: tech-kern
Date: 10/27/2007 00:21:32
Recently I tried running a -current installation on an LFS root
partition, and unfortunately I've been getting deadlocks/freezes again
(no panics thus far).  Each time, the backtrace in ddb showed a call to
ltsleep() somewhere in the bowels of lfs.
Unfortunately, I forgot that ddb.tee_msgbuf is useless if /var isn't usable. ;)

Most of the backtraces I've seen also showed the spl being adjusted.

I was wondering: is there anything preventing us from converting
lfs_interlock to a mutex and perhaps getting rid of some of the spl()
wrapping?
Does anyone have any other ideas on how we might improve
synchronization in this beast?
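
To make the question concrete, here is a rough userspace sketch of the
shape such a conversion might take.  It uses pthreads purely as a
stand-in for the kernel's mutex(9)/condvar(9) primitives; struct
demo_state and the demo_*() functions are hypothetical names, not
anything from the lfs source, and nothing here is tested against lfs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a piece of state currently guarded by
 * lfs_interlock.  In the kernel this would be a kmutex_t and a
 * kcondvar_t rather than pthread objects.
 */
struct demo_state {
	pthread_mutex_t lock;	/* would replace the simple lock */
	pthread_cond_t  cv;	/* would replace the ltsleep() wait channel */
	bool            busy;	/* the condition being waited on */
};

void
demo_init(struct demo_state *s)
{
	pthread_mutex_init(&s->lock, NULL);
	pthread_cond_init(&s->cv, NULL);
	s->busy = false;
}

/*
 * Wait until the resource is free, then claim it.  The mutex is
 * released atomically while sleeping and re-taken on wakeup, and the
 * condition is rechecked in a loop because wakeups can be spurious.
 */
void
demo_acquire(struct demo_state *s)
{
	pthread_mutex_lock(&s->lock);
	while (s->busy)
		pthread_cond_wait(&s->cv, &s->lock);
	s->busy = true;
	pthread_mutex_unlock(&s->lock);
}

/* Release the resource and wake every waiter. */
void
demo_release(struct demo_state *s)
{
	pthread_mutex_lock(&s->lock);
	assert(s->busy);	/* only the current holder should release */
	s->busy = false;
	pthread_cond_broadcast(&s->cv);
	pthread_mutex_unlock(&s->lock);
}
```

The point of the pattern is that the sleep and the lock release happen
as one step, so no separate spl manipulation is needed around the wait
in this sketch.  Whether the real lfs code paths can be restructured
this way is exactly what I'm asking.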

Regards,

--Blair