Subject: kern/2262: CD-ROM deadlock problem fix
To: None <gnats-bugs@NetBSD.ORG>
From: Noriyuki Soda <soda@sra.co.jp>
List: netbsd-bugs
Date: 03/26/1996 06:42:29
>Number:         2262
>Category:       kern
>Synopsis:       CD-ROM deadlock problem fix
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Mar 25 16:50:01 1996
>Last-Modified:
>Originator:     Noriyuki Soda
>Organization:
	Software Research Associates, Inc., Japan
>Release:        1.1B (-current Mar 17, 1996)
>Environment:
System: NetBSD james 1.1B NetBSD 1.1B (PALM) #10: Tue Mar 26 05:37:55 JST 1996 soda@james:/mnt2/current/src/sys/arch/i386/compile/PALM i386

	SCSI:	AHA-2842 (ahe)
	CD-ROM: Matsushita, CD-ROM CR-504

>Description:
	Multiple simultaneous accesses to a CD-ROM deadlock.
	This is the same problem as PR#1256 (reported by Tsugutomo Enami).

>How-To-Repeat:
	# mount -t cd9660 -r /dev/cd0a /cdrom
	% find /cdrom -type d -type f &
	% find /cdrom -type d -type f &
	% find /cdrom -type d -type f &
	% find /cdrom -type d -type f &
	% find /cdrom -type d -type f &
	% find /cdrom -type d -type f &
	% find /cdrom -type d -type f &
	% find /cdrom -type d -type f &

	% ps agxlw | grep find
  UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT TT       TIME COMMAND
 2924   178   154  12 -14  0   188  248 isoigt DW   p0    0:01.69 /cdrom (find)
 2924   179   154  12 -14  0   188  240 isoilk DW   p0    0:01.58 /cdrom (find)
 2924   180   154  14  -5  0   188  240 getblk DW   p0    0:01.60 /cdrom (find)
 2924   181   154  12 -14  0   188  300 isoilk DW   p0    0:01.54 /cdrom (find)
 2924   182   154  10 -14  0   188  436 isoilk DW   p0    0:01.63 /cdrom (find)
 2924   183   154  49 -14  0   188  436 isoigt DW   p0    0:02.07 /cdrom (find)
 2924   184   154  66 -14  0   188  436 isoigt DW   p0    0:01.42 /cdrom (find)
 2924   209   154  25 -14  0   176  392 isoigt DW   p0    0:00.46 /cdrom (find)


	These processes are all deadlocked (there is no CD-ROM activity).

	Note:	patch2 (attached below) is applied, so every WCHAN is
		displayed as a string instead of an address.
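
	The effect on the WCHAN column can be sketched in user space as
	below. This is only an illustration, not the ps source: the
	struct sleeper type and the format_wchan() helper are
	hypothetical names. It assumes that ps prints the wait message
	when one was given to tsleep() and falls back to the sleep
	address when the process went through plain sleep().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the per-process fields that ps reads. */
struct sleeper {
	const void *wchan;	/* address slept on, NULL if not sleeping */
	const char *wmesg;	/* message given to tsleep(), NULL for sleep() */
};

/* Format the WCHAN column into out and return out. */
const char *
format_wchan(const struct sleeper *s, char *out, size_t len)
{
	if (s->wchan == NULL)
		snprintf(out, len, "-");		/* not sleeping */
	else if (s->wmesg != NULL)
		snprintf(out, len, "%s", s->wmesg);	/* e.g. "isoilk" */
	else
		snprintf(out, len, "%lx",		/* bare address */
		    (unsigned long)(uintptr_t)s->wchan);
	return (out);
}
```

	With a message such as "isoilk" set, the column shows that
	string; without one, only a hexadecimal address is available.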

	DDB shows that one of these processes is blocked in
				cd9660_ihasins()
		called from	cd9660_vget_internal()
		called from	cd9660_lookup() - cd9660_lookup.c line 422

>Fix:

	This deadlock seems to be caused by multiple buffers being
	locked (B_BUSY) at once, without any locking protocol.

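	Patch1 fixes this problem by always acquiring buffers in
	top-to-bottom order: the ".." case now releases its directory
	buffer (brelse) before calling cd9660_vget_internal().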
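
	The suspected cycle can be illustrated with a small user-space
	model. This is a sketch under assumptions, not kernel code: the
	struct model type, the buf_acquire()/buf_release() helpers, and
	the two-process scenario are all hypothetical. One process
	walks downward (parent block, then child block) while another
	looks up ".." (child block, then parent block).

```c
#include <assert.h>
#include <stdbool.h>

#define NBUF	2	/* buffers in the model */
#define NPROC	2	/* processes in the model */
#define NOBODY	(-1)

/* Toy state: who holds each buffer, and what each process waits for. */
struct model {
	int owner[NBUF];	/* process holding the buffer, or NOBODY */
	int waiting[NPROC];	/* buffer the process sleeps on, or NOBODY */
};

static void
model_init(struct model *m)
{
	for (int i = 0; i < NBUF; i++)
		m->owner[i] = NOBODY;
	for (int i = 0; i < NPROC; i++)
		m->waiting[i] = NOBODY;
}

/* Mark a buffer busy; on failure record that the process sleeps on it. */
static bool
buf_acquire(struct model *m, int proc, int buf)
{
	if (m->owner[buf] == NOBODY || m->owner[buf] == proc) {
		m->owner[buf] = proc;
		m->waiting[proc] = NOBODY;
		return (true);
	}
	m->waiting[proc] = buf;
	return (false);
}

static void
buf_release(struct model *m, int proc, int buf)
{
	if (m->owner[buf] == proc)
		m->owner[buf] = NOBODY;
}

/* Follow "waits for the holder of" links; returning to start is a cycle. */
static bool
deadlocked(const struct model *m, int start)
{
	int p = start;

	for (int hops = 0; hops < NPROC; hops++) {
		int buf = m->waiting[p];

		if (buf == NOBODY)
			return (false);
		p = m->owner[buf];
		if (p == NOBODY)
			return (false);
		if (p == start)
			return (true);
	}
	return (false);
}

/* release_first selects the patched order (drop the child block first). */
bool
dotdot_deadlocks(bool release_first)
{
	enum { PARENT = 0, CHILD = 1 };
	struct model m;

	model_init(&m);
	buf_acquire(&m, 0, PARENT);	/* proc 0 holds the parent block */
	buf_acquire(&m, 1, CHILD);	/* proc 1 holds the child block */
	if (release_first)
		buf_release(&m, 1, CHILD);
	buf_acquire(&m, 1, PARENT);	/* proc 1 goes upward for ".." */
	buf_acquire(&m, 0, CHILD);	/* proc 0 goes downward */
	return (deadlocked(&m, 0));
}
```

	In this model dotdot_deadlocks(false) finds a wait cycle and
	dotdot_deadlocks(true) does not, because the upward lookup no
	longer holds a lower buffer while sleeping on a higher one.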

	Patch2 only changes the WCHAN format of ps, so patch2 is
	optional.

	  To Enami:
	Please test patch1.

[patch1] fix
---- cut here ---- cut here ---- cut here ---- cut here ---- cut here ----
diff -u sys/isofs/cd9660.org/cd9660_lookup.c sys/isofs/cd9660/cd9660_lookup.c
--- sys/isofs/cd9660.org/cd9660_lookup.c	Sat Feb 10 21:32:05 1996
+++ sys/isofs/cd9660/cd9660_lookup.c	Tue Mar 26 05:37:34 1996
@@ -418,10 +418,11 @@
 	 * it's a relocated directory.
 	 */
 	if (flags & ISDOTDOT) {
+		brelse(bp);		/* race to get the buffer */
 		VOP_UNLOCK(pdp);	/* race to get the inode */
+
 		error = cd9660_vget_internal(vdp->v_mount, dp->i_ino, &tdp,
-					     dp->i_ino != ino, ep);
-		brelse(bp);
+					     dp->i_ino != ino, NULL);
 		if (error) {
 			VOP_LOCK(pdp);
 			return (error);
---- cut here ---- cut here ---- cut here ---- cut here ---- cut here ----


[patch2] change WCHAN format of ps
---- cut here ---- cut here ---- cut here ---- cut here ---- cut here ----
diff -u sys/isofs/cd9660.org/cd9660_node.c sys/isofs/cd9660/cd9660_node.c
--- sys/isofs/cd9660.org/cd9660_node.c	Sat Feb 10 21:32:05 1996
+++ sys/isofs/cd9660/cd9660_node.c	Sat Mar 23 17:41:38 1996
@@ -161,7 +161,7 @@
 			if (inum == ip->i_number && device == ip->i_dev) {
 				if (ip->i_flag & IN_LOCKED) {
 					ip->i_flag |= IN_WANTED;
-					sleep(ip, PINOD);
+					tsleep(ip, PINOD, "isoigt", 0);
 					break;
 				}
 				vp = ITOV(ip);
diff -u sys/isofs/cd9660.org/cd9660_vnops.c sys/isofs/cd9660/cd9660_vnops.c
--- sys/isofs/cd9660.org/cd9660_vnops.c	Sun Mar 17 21:28:06 1996
+++ sys/isofs/cd9660/cd9660_vnops.c	Sat Mar 23 17:42:45 1996
@@ -790,7 +790,7 @@
 start:
 	while (vp->v_flag & VXLOCK) {
 		vp->v_flag |= VXWANT;
-		sleep((caddr_t)vp, PINOD);
+		tsleep(vp, PINOD, "isovlk", 0);
 	}
 	if (vp->v_tag == VT_NON)
 		return (ENOENT);
@@ -805,7 +805,7 @@
 		} else
 			ip->i_lockwaiter = -1;
 #endif
-		(void) sleep((caddr_t)ip, PINOD);
+		(void) tsleep(ip, PINOD, "isoilk", 0);
 		goto start;
 	}
 #ifdef DIAGNOSTIC
---- cut here ---- cut here ---- cut here ---- cut here ---- cut here ----
--
soda@sra.co.jp		Software Research Associates, Inc., Japan
(Noriyuki Soda)		   software tools and technology group
>Audit-Trail:
>Unformatted: