Subject: kern/20676: lfscleaner pndirop deadlock
To: None <gnats-bugs@gnats.netbsd.org>
From: None <karkn443@student.liu.se>
List: netbsd-bugs
Date: 03/13/2003 00:09:39
>Number:         20676
>Category:       kern
>Synopsis:       lfscleaner pndirop deadlock
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 13 00:10:01 PST 2003
>Closed-Date:
>Last-Modified:
>Originator:     Karl Knutsson
>Release:        NetBSD 1.6P
>Organization:
>Environment:
NetBSD love 1.6P NetBSD 1.6P (LFS_UBC_DEBUG) #6: Wed Mar 12 16:31:11 UTC 2003  wbagger@love:/usr/src/sys/arch/sparc/compile/LFS_UBC_DEBUG sparc
>Description:
In lfs_fcntl, at case LFCNRECLAIM, the cleaner acquires the segment
lock and then calls lfs_flush_dirops. If any dirops are active, the
cleaner sleeps while holding the segment lock. Meanwhile, the process
performing the dirop tries to acquire the segment lock, so neither
can proceed. In the ps output below, lfs_cleanerd is asleep on
"pndirop" and the lfstrasher processes are asleep on "lfs segl".
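
The ordering problem can be sketched with a toy model. This is not
kernel code: struct model, run_model and the CLEANER constant are made
up for illustration, and the model leaves out the fact that the real
lfs_flush_dirops also takes the segment lock itself. It only shows why
"lock, then wait for dirops" can stall while "wait for dirops, then
lock" cannot.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
enum { CLEANER = 1 };

struct model {
	int seglock_owner;	/* 0 = free, otherwise id of the holder */
	int active_dirops;	/* dirops still in progress */
};

/*
 * Returns true if both the cleaner and the dirop process finish,
 * false if they end up waiting on each other.
 */
bool
run_model(bool lock_before_flush)
{
	struct model m = { 0, 1 };	/* one dirop already in progress */
	bool cleaner_done = false, dirop_done = false;
	bool progress = true;

	/* Old order: the cleaner takes the segment lock first. */
	if (lock_before_flush)
		m.seglock_owner = CLEANER;

	while (progress && !(cleaner_done && dirop_done)) {
		progress = false;

		/* The dirop process needs the segment lock to finish. */
		if (!dirop_done && m.seglock_owner == 0) {
			assert(m.active_dirops > 0);
			m.active_dirops--;
			dirop_done = true;
			progress = true;
		}

		/* The cleaner's flush completes only with no active dirops. */
		if (!cleaner_done && m.active_dirops == 0) {
			m.seglock_owner = 0;	/* work done, lock released */
			cleaner_done = true;
			progress = true;
		}
	}
	return cleaner_done && dirop_done;
}
```

With lock_before_flush set, neither side can make progress and
run_model returns false; with it clear, the dirop drains first and both
sides finish.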

# ps -s 
UID  PID PPID CPU LID NLWP PRI NI VSZ  RSS WCHAN    STAT TT    TIME COMMAND
  0  178    1  14   1    1   3  0  48  704 ttyin    IW   ?? 0:00.09 /usr/libexec/getty suncons console 
  0  187  180   0   1    1  18  0 292  916 pause    S    p0 0:00.58 ksh 
  0 7623    1   0   1    1  -5  0 100  548 lfs segl D    p0 0:02.35 /home/wbagger/code/lfstrasher/lfstrasher . 
  0 7624    1  11   1    1  -5  0 100  548 lfs segl D    p0 0:07.13 /home/wbagger/code/lfstrasher/lfstrasher . 
  0 7625    1   2   1    1  -5  0 100  548 lfs segl D    p0 0:07.20 /home/wbagger/code/lfstrasher/lfstrasher . 
  0 7626    1   0   1    1  -5  0 100  548 lfs segl D    p0 0:02.61 /home/wbagger/code/lfstrasher/lfstrasher . 
  0 7649  187   1   1    1  29  0  80  576 -        R    p0 0:00.03 ps -s 
  0  207  182   0   1    1  18  0 256  880 pause    SW   p1 0:00.25 ksh 
  0 7613  207   0   1    1  -5  0 968 1528 pndirop  D    p1 1:12.48 /usr/libexec/lfs_cleanerd -n 4 -d -l 3 /mnt 

>How-To-Repeat:
Perform many dirops from several concurrent processes on an LFS
file system while the cleaner is running.
>Fix:
Index: lfs_vnops.c
===================================================================
RCS file: /misc/netbsd/src/sys/ufs/lfs/lfs_vnops.c,v
retrieving revision 1.95
diff -u -r1.95 lfs_vnops.c
--- lfs_vnops.c	8 Mar 2003 21:46:06 -0000	1.95
+++ lfs_vnops.c	11 Mar 2003 06:39:04 -0000
@@ -1093,7 +1093,7 @@
 }
 
 static void
-lfs_flush_dirops(struct lfs *fs)
+lfs_flush_dirops(struct lfs *fs, int flags)
 {
 	struct inode *ip, *nip;
 	struct vnode *vp;
@@ -1125,7 +1125,7 @@
 	 * even though we are leaving out all the file data.
 	 */
 	lfs_imtime(fs);
-	lfs_seglock(fs, SEGM_CKP);
+	lfs_seglock(fs, SEGM_CKP | flags);
 	sp = fs->lfs_sp;
 
 	/*
@@ -1266,8 +1266,8 @@
 		 */
 		fs = VTOI(ap->a_vp)->i_lfs;
 		off = fs->lfs_offset;
-		lfs_seglock(fs, SEGM_FORCE_CKP | SEGM_CKP);
-		lfs_flush_dirops(fs);
+		lfs_flush_dirops(fs, SEGM_FORCE_CKP);
+		lfs_seglock(fs, SEGM_FORCE_CKP | SEGM_CKP); 
 		LFS_CLEANERINFO(cip, fs, bp);
 		oclean = cip->clean;
 		LFS_SYNC_CLEANERINFO(cip, fs, bp, 1);
>Release-Note:
>Audit-Trail:
>Unformatted: