Subject: Re: LFS stability: kern/36608
To: Blair Sadewitz <blair.sadewitz@gmail.com>
From: Sverre Froyen <sverre@viewmark.com>
List: current-users
Date: 09/22/2007 09:47:47
On Thursday 13 September 2007, you wrote:
> I suspect it masks the problem.  Try this out:
>
> set vfs.lfs.pagetrip to something sane, such as ssize/4096/4.  Each
> time you decrease it (by powers of two), try untarring pkgsrc (this is
> on a multiprocessor machine, since we're talking about locking).  The
> last decrement in pagetrip should be ssize/PAGE_SIZE.  Notice how, if
> you get deadlocks, etc., they seem to happen more often as you go
> lower.  Now, try setting pagetrip to something one or two powers of
> two lower than ssize/PAGE_SIZE, such as 64.  Now notice how quickly it
> locks!

OK, dumpfs_lfs reports that ssize = 1048576 and sysctl says hw.pagesize = 
4096. Thus, ssize/4096/4 = 64 and ssize/PAGE_SIZE = 256, so I set 
vfs.lfs.pagetrip=64 (it was 0).
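
For the record, here is the arithmetic as a small Python sketch.  The variable 
names are my own and not anything from the LFS code; the two input values are 
the ones reported above.

```python
# Inputs as reported on my system (see above).
ssize = 1048576       # LFS segment size in bytes, from dumpfs_lfs
page_size = 4096      # hw.pagesize, from sysctl

# Variable names below are illustrative only, not LFS identifiers.
pages_per_segment = ssize // page_size        # ssize/PAGE_SIZE
suggested_pagetrip = pages_per_segment // 4   # ssize/4096/4

print("ssize/PAGE_SIZE =", pages_per_segment)
print("ssize/4096/4    =", suggested_pagetrip)
```

The second value is the one I used for vfs.lfs.pagetrip.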

Running my bogofilter test case (see kern/36608) no longer triggers the 
LOCKDEBUG assertion.  I will resume regular use of bogofilter in my email 
client in order to test more thoroughly.

BTW, my system is a ThinkPad T42, i.e., a uniprocessor system.

> I suspected this has to do with the lfs writer daemon ltsleeping,
> given the time it sleeps is fixed.  Haven't gotten a chance to look at
> it more, though.

Do you think it is the lfs_writer daemon that makes the call to lfs_segunlock 
around line 2290 in lfs_vnops.c (see the PR)?  If so, your suspicion sounds 
reasonable.  How can I verify this?

> Any thoughts?

I still think the LFS code (see the PR) looks questionable, but I have not 
received any feedback on my comments.

Thanks for your help!

Sverre