Subject: port-i386/3773: locking bug in wd.c
To: None <gnats-bugs@gnats.netbsd.org>
From: None <onno@simplex.nl>
List: netbsd-bugs
Date: 06/22/1997 21:58:32
>Number: 3773
>Category: port-i386
>Synopsis: locking bug in wd.c
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: gnats-admin (GNATS administrator)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Jun 22 13:05:01 1997
>Last-Modified:
>Originator: Onno van der Linden
>Organization:
>Release: June 22nd <NetBSD-current source date>
>Environment:
System: NetBSD sheep 1.2G NetBSD 1.2G (SHEEP) #17: Sun Jun 22 20:18:15 MEST 1997 onno@sheep:/usr/src/sys/arch/i386/compile/SHEEP i386
>Description:
There's a bug regarding the locking mechanism in wd_get_parms() from
/sys/dev/isa/wd.c which makes it possible to crash the startup of NetBSD/i386.
You'll only notice this if you have TWO drives at ONE controller. Besides that
this bug is very timing dependent, so not everyone will have a crash at startup.
The wakeup() statement in wdcstart() can happen in the middle of a transfer
((wdc->sc_flags & WDCF_ACTIVE) == 0 && wd->sc_skip != 0) to the `other' drive
at the wd controller.
NOTE: this may effect the arm32 port and other ports that use the wd driver.
>How-To-Repeat:
On my system
1. Boot single user mode
2. Hit ENTER for sh
3. fsck -p
Results in the following output of ddb
kernel: integer divide overflow trap, code = 0
stopped at _readdisklabel+0xae divl 0x38(%ecx), %eax
Also possible is a normal startup with the wd driver saying
wd0: wdcintr read intr before drq status 0 error 0
>Fix:
The *real* fix is to implement what Charles wrote in the wd_get_parms()
comment. For the moment the fix below will do. It will only wakeup()
the wd_get_parms() request if the controller that the drive is attached
to has *nothing* (i.e. request queue is empty) to do. Even with this patch
there's still a race between wdcstart() and wd_get_parms() to be the first
to enter a command. If wdcstart() is able to get there first and issue
a multi sector read command, we're still in trouble ......
*** /sys/dev/isa/wd.c.old Sun Jun 22 13:53:45 1997
--- /sys/dev/isa/wd.c Sun Jun 22 20:17:19 1997
***************
*** 538,556 ****
#endif
/*
* XXX
* This is a kluge. See comments in wd_get_parms().
*/
! if ((wdc->sc_flags & WDCF_WANTED) != 0) {
! wdc->sc_flags &= ~WDCF_WANTED;
! wakeup(wdc);
return;
}
-
- loop:
- /* Is there a drive for the controller to do a transfer with? */
- wd = wdc->sc_drives.tqh_first;
- if (wd == NULL)
- return;
/* Is there a transfer to this drive? If not, deactivate drive. */
--- 538,556 ----
#endif
+
+ loop:
+ /* Is there a drive for the controller to do a transfer with? */
+ wd = wdc->sc_drives.tqh_first;
+ if (wd == NULL) {
/*
* XXX
* This is a kluge. See comments in wd_get_parms().
*/
! if ((wdc->sc_flags & WDCF_WANTED) != 0) {
! wdc->sc_flags &= ~WDCF_WANTED;
! wakeup(wdc);
! }
return;
}
/* Is there a transfer to this drive? If not, deactivate drive. */
>Audit-Trail:
>Unformatted: