Subject: kern/4965: lfs_bmapv and lfs_markv lossage
To: None <gnats-bugs@gnats.netbsd.org, perseant@hhhh.org>
From: None <perseant@hhhh.org>
List: netbsd-bugs
Date: 02/09/1998 19:50:52
>Number: 4965
>Category: kern
>Synopsis: vinvalbuf panics from within lfs_markv, and ufs_lock from lfs_bmapv
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people (Kernel Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Feb 9 20:05:02 1998
>Last-Modified:
>Originator: Konrad Schroder
>Organization:
-------------------------------------------------------------------------
Konrad Schroder http://www.hitl.washington.edu/people/perseant/
System Administrator perseant@hitl.washington.edu
Human Interface Technology Lab Voice: (206) 616-1478
Box 352142, University of Washington, 98195, USA FAX: (206) 543-5380
>Release: 1.3 (+ patches for kern/4641 and kern/4812)
>Environment:
System: NetBSD inle 1.3 NetBSD 1.3 (INLE) #77: Sun Feb 8 18:49:33 PST 1998 perseant@inle:/usr/src/sys/arch/i386/compile/INLE i386
>Description:
vclean() sets VXLOCK, then calls vinvalbuf.
vinvalbuf calls VOP_FSYNC, expecting that this will clear out all of
the dirty bufs associated with the vnode...in the LFS, this
eventually calls lfs_segwrite.
lfs_segwrite in turn calls vget() on all dirty nodes---unless
they're VXLOCKed---including the one that we're trying to
invalidate, and flushes them all to disk. Thinking its job is done,
it returns.
Then vinvalbuf panics because not all buffers got flushed from the
locked node, because, er, it was locked.
>How-To-Repeat:
Try using an LFS for a while with DIAGNOSTIC on, or...
[Create a 96 Mb LFS partition on sd1c -- I use a Zip disk]
newlfs -L /dev/rsd1c
mount -t lfs -o-n /dev/sd1c /lfs
(cd /usr/src; tar cf - gnu) | (cd /lfs; tar xf -)
[Cause a segment to be cleaned -- I've adapted lfs_cleanerd:
lfs_cleanerd -a /lfs ]
>Fix:
This solution involves marking the vnode with a special flag,
VLFSFLUSH, so that the LFS callback code ignores (well, works
around) the VXLOCK when flushing that particular vnode to disk.
When a check for VLFSFLUSH is added to lfs_vref() it also has to be
added to lfs_vunref(), in order not to vrele() too many times.
[Would it be useful to have an extra field in the vnode for
fs-specific flags like this (and VDIROP), even if it is not known
that an inode or equivalent exists?]
*** ./sys/vnode.h Mon Feb 9 18:27:09 1998
--- ./sys/vnode.h.fixed Sat Jan 31 00:00:27 1998
***************
*** 121,124 ****
--- 121,125 ----
#define VALIASED 0x0800 /* vnode has an alias */
#define VDIROP 0x1000 /* LFS: vnode is involved in a directory op */
+ #define VLFSFLUSH 0x2000 /* LFS: vnode is being flushed */
/*
*** ./ufs/lfs/lfs_segment.c Tue Jan 20 19:09:18 1998
--- ./ufs/lfs/lfs_segment.c.fixed.2 Sun Feb 8 18:42:23 1998
***************
*** 120,127 ****
struct lfs *fs;
struct segment *sp;
fs = VFSTOUFS(vp->v_mount)->um_lfs;
! if (fs->lfs_nactive > MAX_ACTIVE)
! return(lfs_segwrite(vp->v_mount, SEGM_SYNC|SEGM_CKP));
lfs_seglock(fs, SEGM_SYNC);
sp = fs->lfs_sp;
--- 120,134 ----
struct lfs *fs;
struct segment *sp;
+ int tmp;
+
+ /* Protect against VXLOCK deadlock in vinvalbuf() */
+ vp->v_flag |= VLFSFLUSH;
fs = VFSTOUFS(vp->v_mount)->um_lfs;
! if (fs->lfs_nactive > MAX_ACTIVE) {
! tmp = lfs_segwrite(vp->v_mount, SEGM_SYNC|SEGM_CKP);
! vp->v_flag &= ~VLFSFLUSH;
! return tmp;
! }
lfs_seglock(fs, SEGM_SYNC);
sp = fs->lfs_sp;
***************
*** 148,151 ****
--- 155,159 ----
#endif
lfs_segunlock(fs);
+ vp->v_flag &= ~VLFSFLUSH;
return (0);
}
***************
*** 1096,1100 ****
register struct vnode *vp;
{
!
if (vp->v_flag & VXLOCK)
return(1);
--- 1104,1116 ----
register struct vnode *vp;
{
! /*
! * If we return 1 here, we risk vinvalbuf() not being able to
! * flush all of the pages from this vnode, which will cause
! * a panic during lfs_markv. So, return 0 if that is in progress.
! */
! if (vp->v_flag & VLFSFLUSH) {
! vp->v_usecount++;
! return(0);
! }
if (vp->v_flag & VXLOCK)
return(1);
***************
*** 1108,1111 ****
--- 1124,1135 ----
extern int lfs_no_inactive;
+ /*
+ * We also don't want to vrele() here since that will be done
+ * again later, causing us serious problems.
+ */
+ if(vp->v_flag & VLFSFLUSH) {
+ vp->v_usecount--;
+ return;
+ }
/*
* This is vrele except that we do not want to VOP_INACTIVE
>Audit-Trail:
>Unformatted: