Subject: Re: kern/36608: LFS related panic with LOCKDEBUG
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: Sverre Froyen <sverre@viewmark.com>
List: netbsd-bugs
Date: 07/30/2007 23:05:05
The following reply was made to PR kern/36608; it has been noted by GNATS.
From: Sverre Froyen <sverre@viewmark.com>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/36608: LFS related panic with LOCKDEBUG
Date: Mon, 30 Jul 2007 16:03:04 -0600
In lfs_vnops.c there is a comment about genfs_putpages stating:
* (2) It needs to explicitly send blocks to be written when it is done.
* If VOP_PUTPAGES is called without the seglock held, we simply take
* the seglock and let lfs_segunlock wait for us.
* XXX There might be a bad situation if we have to flush a vnode while
* XXX lfs_markv is in operation. As of this writing we panic in this
* XXX case.
I have done a little more investigation and I find that I consistently get a
double lock panic on the vnode(?) that is locked immediately before the call
to lfs_segunlock, around line 2290 in lfs_vnops.c:
    simple_unlock(&vp->v_interlock);
    simple_lock(&vp->v_interlock);
    write_and_wait(fs, vp, busypg, seglocked, NULL);
*** vp is locked at this point
    if (!seglocked) {
        lfs_release_finfo(fs);
        lfs_segunlock(fs);
*** I get the panic before the call to lfs_segunlock returns
    }
    sp->vp = NULL;
    goto get_seglock;
It looks like lfs_segunlock is sleeping in the second while loop in this code
snippet from lfs_subr.c:
    simple_lock(&fs->lfs_interlock);
    while (ckp && sync && fs->lfs_iocount)
        (void)ltsleep(&fs->lfs_iocount, PRIBIO + 1,
            "lfs_iocount", 0, &fs->lfs_interlock);
    while (sync && sp->seg_iocount) {
        (void)ltsleep(&sp->seg_iocount, PRIBIO + 1,
            "seg_iocount", 0, &fs->lfs_interlock);
        DLOG((DLOG_SEG, "sleeping on iocount %x == %d\n", sp,
            sp->seg_iocount));
    }
    simple_unlock(&fs->lfs_interlock);
I do not know whether the comment above refers to the case I'm seeing, but
while lfs_segunlock is sleeping, some other code comes along and attempts to
lock the vnode that was locked in genfs_putpages.
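
To make the suspected sequence concrete, here is a minimal userspace sketch in C. It is not kernel code and not a proposed patch. The struct dbg_lock type and the functions dbg_lock_acquire, dbg_lock_release and replay_sequence are hypothetical names invented for this illustration, and the "threads" are just integer ids replayed in a fixed order. The one assumption taken from the snippets above is that the sleep is handed only &fs->lfs_interlock, so that is the only lock the model drops while sleeping; vp->v_interlock stays held.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model of a debug-checked simple lock (not the kernel type). */
struct dbg_lock {
    const char *name;
    int held;      /* 1 while some thread holds the lock */
    int holder;    /* id of the holding thread, -1 when free */
};

/*
 * Try to take the lock on behalf of `thread`.
 * Returns 0 on success, -1 where a lock-debugging kernel would complain
 * about a double lock.
 */
static int dbg_lock_acquire(struct dbg_lock *l, int thread)
{
    if (l->held) {
        fprintf(stderr, "double lock on %s: held by %d, wanted by %d\n",
            l->name, l->holder, thread);
        return -1;
    }
    l->held = 1;
    l->holder = thread;
    return 0;
}

static void dbg_lock_release(struct dbg_lock *l)
{
    assert(l->held);    /* releasing a free lock is a bug in the model */
    l->held = 0;
    l->holder = -1;
}

/*
 * Replay the suspected ordering with two modeled threads.
 * drop_vnode_lock_first == 0: thread 1 keeps v_interlock across the sleep.
 * drop_vnode_lock_first == 1: contrast case, lock released before sleeping.
 * Returns 1 if thread 2 hits a double lock, 0 otherwise.
 */
static int replay_sequence(int drop_vnode_lock_first)
{
    struct dbg_lock v_interlock = { "v_interlock", 0, -1 };
    struct dbg_lock lfs_interlock = { "lfs_interlock", 0, -1 };
    int detected = 0;

    /* Thread 1: takes the vnode interlock, then enters the unlock path. */
    dbg_lock_acquire(&v_interlock, 1);
    if (drop_vnode_lock_first)
        dbg_lock_release(&v_interlock);
    dbg_lock_acquire(&lfs_interlock, 1);
    dbg_lock_release(&lfs_interlock);    /* sleep begins: only this is dropped */

    /* Thread 2: runs while thread 1 sleeps and wants the same vnode. */
    if (dbg_lock_acquire(&v_interlock, 2) != 0)
        detected = 1;
    else
        dbg_lock_release(&v_interlock);

    /* Thread 1: wakes up, retakes the interlock, finishes. */
    dbg_lock_acquire(&lfs_interlock, 1);
    dbg_lock_release(&lfs_interlock);
    if (!drop_vnode_lock_first)
        dbg_lock_release(&v_interlock);

    return detected;
}
```

Calling replay_sequence(0) from a small main() reports the double lock, while replay_sequence(1) does not. The contrast case is included only to show that the detection depends on the lock being held across the sleep; whether releasing it earlier would be correct in the real code is a separate question.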