Port-vax archive

Race in MSCP (ra/rx) driver



In sys/dev/mscp/mscp_disk.c, I see (in both 1.4T and cvsweb's most
current version)

rx_putonline(struct rx_softc *rx)
{
...
        /* Poll away */
        bus_space_read_2(mi->mi_iot, mi->mi_iph, 0);
        if (tsleep(&rx->ra_state, PRIBIO, "rxonline", 100*100))
                rx->ra_state = DK_CLOSED;

In 1.4T, this code runs at IPL 0.  Unless it is called at elevated IPL
in -current, there is a race: if the operation completes and its
interrupt is handled before tsleep puts the thread to sleep, the wakeup
is lost, and the thread sleeps for the full 10000 ticks and then fails.
Presumably most real hardware isn't that fast, but something like an
MSCP interface backed by RAM or, in my case, a simulation, can trip
this race.  (simh has comments saying, to paraphrase from memory, that
VMS works with an infinitely fast MSCP disk but the BSDs don't; this
race is likely part of the reason.)
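
To make the ordering problem concrete, here is a toy, single-threaded
model of it in plain C.  It is not kernel code: struct model,
device_completes(), model_tsleep() and put_online() are all
hypothetical names made up for illustration.  The model assumes that a
wakeup only affects a thread that is already asleep, and that an
interrupt blocked by raised IPL is delivered once the thread has gone
to sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: one sleeping thread, one device interrupt. */
struct model {
    bool asleep;       /* thread is inside the modelled tsleep() */
    bool woken;        /* a wakeup reached the sleeping thread */
    bool irq_pending;  /* device is done, interrupt not yet delivered */
    bool irq_blocked;  /* models raised IPL, e.g. after splhigh() */
};

/* Modelled interrupt handler: the wakeup only helps a thread that is asleep. */
static void deliver_irq(struct model *m)
{
    assert(m->irq_pending);
    m->irq_pending = false;
    if (m->asleep)
        m->woken = true;
    /* otherwise nobody is on the sleep channel and the wakeup is lost */
}

/* The device finishes; the interrupt is taken at once unless blocked. */
static void device_completes(struct model *m)
{
    m->irq_pending = true;
    if (!m->irq_blocked)
        deliver_irq(m);
}

/* Modelled tsleep(): returns 0 if woken, 1 if the timeout expired. */
static int model_tsleep(struct model *m)
{
    m->asleep = true;
    /* Assumption: once asleep, a blocked interrupt can be delivered. */
    m->irq_blocked = false;
    if (m->irq_pending)
        deliver_irq(m);
    m->asleep = false;
    return m->woken ? 0 : 1;
}

/* An "infinitely fast" device completes before the thread can sleep. */
int put_online(bool raise_ipl)
{
    struct model m = { false, false, false, false };

    m.irq_blocked = raise_ipl;  /* the patched code raises IPL first */
    device_completes(&m);       /* done before tsleep is reached */
    return model_tsleep(&m);
}
```

In this model put_online(false) times out because the wakeup arrives
before anyone is asleep, while put_online(true) succeeds because the
interrupt is held off until the thread is on the sleep channel.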

Here's the fix I'm using, in case anyone wants to pick it up (and
preferably figure out the right spl*() call - or, better, redo it in an
MP-safe way, in case the interrupt is fielded by a different CPU from
the one running this code):

diff --git a/sys/dev/mscp/mscp_disk.c b/sys/dev/mscp/mscp_disk.c
index ea16b1c..f47e777 100644
--- a/sys/dev/mscp/mscp_disk.c
+++ b/sys/dev/mscp/mscp_disk.c
@@ -572,6 +572,7 @@ rx_putonline(rx)
        struct  mscp *mp;
        struct  mscp_softc *mi = (struct mscp_softc *)rx->ra_dev.dv_parent;
        volatile int i;
+ int s;
 
        rx->ra_state = DK_CLOSED;
        mp = mscp_getcp(mi, MSCP_WAIT);
@@ -580,10 +581,14 @@ rx_putonline(rx)
        mp->mscp_cmdref = 1;
        *mp->mscp_addr |= MSCP_OWN | MSCP_INT;
 
+ // Must block interrupts from bus_space_read_2 until asleep.
+ // Don't know how to be sure what spl is enough, so....
+ s = splhigh();
        /* Poll away */
        i = bus_space_read_2(mi->mi_iot, mi->mi_iph, 0);
        if (tsleep(&rx->ra_dev.dv_unit, PRIBIO, "rxonline", 100*100))
                rx->ra_state = DK_CLOSED;
+ splx(s);
 
        if (rx->ra_state == DK_CLOSED)
                return MSCP_FAILED;

With this, and a closely related fix in the boot blocks (I checked
cvsweb for that, but the code has been restructured enough that I'm not
sure whether -current has an analogous issue), NetBSD works on my
(effectively infinitely fast) MSCP simulation.
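
As for the MP-safe variant mentioned above, one possible shape is to
record completion in a flag protected by a lock and to sleep only while
that flag is clear, so a completion that arrives early is never
missed.  The sketch below is a user-space analogue using POSIX threads,
not driver code; struct online_op, op_init(), op_complete() and
op_wait() are hypothetical names.  In a current NetBSD kernel the
corresponding primitives would presumably be mutex(9) and condvar(9),
but that mapping is an assumption, not something the patch does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-operation state (user-space analogue). */
struct online_op {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    bool            done;  /* set by the completion path, under lock */
};

void op_init(struct online_op *op)
{
    pthread_mutex_init(&op->lock, NULL);
    pthread_cond_init(&op->cv, NULL);
    op->done = false;
}

/* Completion path: plays the role of the interrupt handler. */
void op_complete(struct online_op *op)
{
    pthread_mutex_lock(&op->lock);
    op->done = true;                 /* remembered even if nobody waits yet */
    pthread_cond_broadcast(&op->cv);
    pthread_mutex_unlock(&op->lock);
}

/* Waiter: returns 0 once completed, or an errno value such as ETIMEDOUT. */
int op_wait(struct online_op *op, long timeout_ms)
{
    struct timespec deadline;
    int error = 0;

    assert(timeout_ms >= 0);
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&op->lock);
    /* The flag is tested under the lock, so an early completion is seen. */
    while (!op->done && error == 0)
        error = pthread_cond_timedwait(&op->cv, &op->lock, &deadline);
    if (op->done)
        error = 0;
    pthread_mutex_unlock(&op->lock);
    return error;
}
```

The point of the design is that the waiter checks a stored flag rather
than relying on being asleep at the moment of the wakeup, so it works
no matter which CPU handles the completion or how fast the device is.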

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse%rodents-montreal.org@localhost
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
