Subject: kern/2262: CD-ROM deadlock problem fix
To: None <gnats-bugs@NetBSD.ORG>
From: Noriyuki Soda <soda@sra.co.jp>
List: netbsd-bugs
Date: 03/26/1996 06:42:29
>Number: 2262
>Category: kern
>Synopsis: CD-ROM deadlock problem fix
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people (Kernel Bug People)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Mar 25 16:50:01 1996
>Last-Modified:
>Originator: Noriyuki Soda
>Organization:
Software Research Associates, Inc., Japan
>Release: 1.1B (-current Mar 17, 1996)
>Environment:
System: NetBSD james 1.1B NetBSD 1.1B (PALM) #10: Tue Mar 26 05:37:55 JST 1996 soda@james:/mnt2/current/src/sys/arch/i386/compile/PALM i386
SCSI: AHA-2842 (ahe)
CD-ROM: Matsushita, CD-ROM CR-504
>Description:
Multiple simultaneous accesses to a CD-ROM deadlock.
This is the same problem as PR#1256 (reported by Tsugutomo Enami).
>How-To-Repeat:
# mount -t cd9660 -r /dev/cd0a /cdrom
% find /cdrom -type d -type f &
% find /cdrom -type d -type f &
% find /cdrom -type d -type f &
% find /cdrom -type d -type f &
% find /cdrom -type d -type f &
% find /cdrom -type d -type f &
% find /cdrom -type d -type f &
% find /cdrom -type d -type f &
% ps agxlw | grep find
UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND
2924 178 154 12 -14 0 188 248 isoigt DW p0 0:01.69 /cdrom (find)
2924 179 154 12 -14 0 188 240 isoilk DW p0 0:01.58 /cdrom (find)
2924 180 154 14 -5 0 188 240 getblk DW p0 0:01.60 /cdrom (find)
2924 181 154 12 -14 0 188 300 isoilk DW p0 0:01.54 /cdrom (find)
2924 182 154 10 -14 0 188 436 isoilk DW p0 0:01.63 /cdrom (find)
2924 183 154 49 -14 0 188 436 isoigt DW p0 0:02.07 /cdrom (find)
2924 184 154 66 -14 0 188 436 isoigt DW p0 0:01.42 /cdrom (find)
2924 209 154 25 -14 0 176 392 isoigt DW p0 0:00.46 /cdrom (find)
These processes are all deadlocked (there is no CD-ROM activity).
Note: patch2 (attached below) is applied, so every WCHAN is
displayed as a string instead of an address.
DDB shows that one of these processes is blocked in
cd9660_ihasins()
called from cd9660_vget_internal()
called from cd9660_lookup() - cd9660_lookup.c line 422
>Fix:
This deadlock seems to be caused by multiple buffers being locked
(with B_BUSY) at once, without any locking protocol.
Patch1 fixes this problem by accessing buffers in top-to-bottom
order.
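The ordering rule behind patch1 can be illustrated outside the kernel.
The following is only a user-space sketch, not the cd9660 code: pthread
mutexes stand in for the B_BUSY buffer lock and the inode lock, and
parent_lock, child_lock, lookup_down(), lookup_dotdot() and
run_lookups() are hypothetical names. The ".." path drops the lower
lock before it takes the upper one, in the same way that patch1 calls
brelse(bp) before cd9660_vget_internal().

```c
/* User-space sketch only. All names are hypothetical; pthread mutexes
 * stand in for the kernel's buffer (B_BUSY) and inode locks. */
#include <pthread.h>

#define ITERATIONS 100000

static pthread_mutex_t parent_lock = PTHREAD_MUTEX_INITIALIZER; /* "top" */
static pthread_mutex_t child_lock  = PTHREAD_MUTEX_INITIALIZER; /* "bottom" */
static long lookups_done;       /* only modified while parent_lock is held */

/* Ordinary lookup: lock the parent first, then the child. */
static void *lookup_down(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&parent_lock);
        pthread_mutex_lock(&child_lock);
        lookups_done++;
        pthread_mutex_unlock(&child_lock);
        pthread_mutex_unlock(&parent_lock);
    }
    return NULL;
}

/* ".." lookup: starts out holding the child and needs the parent.
 * Locking the parent while still holding the child would invert the
 * order used by lookup_down() and could deadlock, so the child is
 * released first and both locks are then taken top to bottom. */
static void *lookup_dotdot(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&child_lock);    /* state on entry */
        pthread_mutex_unlock(&child_lock);  /* release before going up */
        pthread_mutex_lock(&parent_lock);
        pthread_mutex_lock(&child_lock);
        lookups_done++;
        pthread_mutex_unlock(&child_lock);
        pthread_mutex_unlock(&parent_lock);
    }
    return NULL;
}

/* Runs both kinds of lookup concurrently. Returns the number of
 * lookups completed, or -1 if a thread could not be created. */
long run_lookups(void)
{
    pthread_t down, up;

    lookups_done = 0;
    if (pthread_create(&down, NULL, lookup_down, NULL) != 0)
        return -1;
    if (pthread_create(&up, NULL, lookup_dotdot, NULL) != 0) {
        pthread_join(down, NULL);
        return -1;
    }
    pthread_join(down, NULL);
    pthread_join(up, NULL);
    return lookups_done;
}
```

Because both paths acquire the locks in the same order, run_lookups()
always terminates. If lookup_dotdot() kept child_lock held while
waiting for parent_lock, the two threads could block each other, which
is the kind of hang shown in the ps output above.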
Patch2 only changes the WCHAN format of ps, so patch2 is
optional.
To enami:
Please test patch1.
[patch1] fix
---- cut here ---- cut here ---- cut here ---- cut here ---- cut here ----
diff -u sys/isofs/cd9660.org/cd9660_lookup.c sys/isofs/cd9660/cd9660_lookup.c
--- sys/isofs/cd9660.org/cd9660_lookup.c Sat Feb 10 21:32:05 1996
+++ sys/isofs/cd9660/cd9660_lookup.c Tue Mar 26 05:37:34 1996
@@ -418,10 +418,11 @@
* it's a relocated directory.
*/
if (flags & ISDOTDOT) {
+ brelse(bp); /* race to get the buffer */
VOP_UNLOCK(pdp); /* race to get the inode */
+
error = cd9660_vget_internal(vdp->v_mount, dp->i_ino, &tdp,
- dp->i_ino != ino, ep);
- brelse(bp);
+ dp->i_ino != ino, NULL);
if (error) {
VOP_LOCK(pdp);
return (error);
---- cut here ---- cut here ---- cut here ---- cut here ---- cut here ----
[patch2] change WCHAN format of ps
---- cut here ---- cut here ---- cut here ---- cut here ---- cut here ----
diff -u sys/isofs/cd9660.org/cd9660_node.c sys/isofs/cd9660/cd9660_node.c
--- sys/isofs/cd9660.org/cd9660_node.c Sat Feb 10 21:32:05 1996
+++ sys/isofs/cd9660/cd9660_node.c Sat Mar 23 17:41:38 1996
@@ -161,7 +161,7 @@
if (inum == ip->i_number && device == ip->i_dev) {
if (ip->i_flag & IN_LOCKED) {
ip->i_flag |= IN_WANTED;
- sleep(ip, PINOD);
+ tsleep(ip, PINOD, "isoigt", 0);
break;
}
vp = ITOV(ip);
diff -u sys/isofs/cd9660.org/cd9660_vnops.c sys/isofs/cd9660/cd9660_vnops.c
--- sys/isofs/cd9660.org/cd9660_vnops.c Sun Mar 17 21:28:06 1996
+++ sys/isofs/cd9660/cd9660_vnops.c Sat Mar 23 17:42:45 1996
@@ -790,7 +790,7 @@
start:
while (vp->v_flag & VXLOCK) {
vp->v_flag |= VXWANT;
- sleep((caddr_t)vp, PINOD);
+ tsleep(vp, PINOD, "isovlk", 0);
}
if (vp->v_tag == VT_NON)
return (ENOENT);
@@ -805,7 +805,7 @@
} else
ip->i_lockwaiter = -1;
#endif
- (void) sleep((caddr_t)ip, PINOD);
+ (void) tsleep(ip, PINOD, "isoilk", 0);
goto start;
}
#ifdef DIAGNOSTIC
---- cut here ---- cut here ---- cut here ---- cut here ---- cut here ----
--
soda@sra.co.jp Software Research Associates, Inc., Japan
(Noriyuki Soda) software tools and technology group
>Audit-Trail:
>Unformatted: