Port-vax archive


Re: Race in MSCP (ra/rx) driver



>> Not entirely true. Many controllers were designed long after the
>> processors they were connected to, and made use of way more modern
>> hardware that was much faster.  [...]
> They might be faster on a per instruction basis, BUT none would be so
> fast that they could receive a command, probe some arbitrary amount
> of memory and produce useful results (usually back in memory) before
> the system CPU started executing the next instruction.

Not "some arbitrary amount of memory".  The amount of memory accessed
by the device - at least in the races I dealt with - was fixed, and
quite small.  In the case of the bootblocks' race, the first DMA cycle
by the device was all it took to break things - and that *could* well
arrive before the next instruction runs, especially if the setup
sequencing is done in a gate array or some such instead of firmware.

Also, "before the [host] started executing the next instruction" is not
what matters.  "Before the host finishes preparing for completion" is
what matters, and, in each of the cases I fixed, that took well over one
instruction.  In the bootblock case, I think it was something like
8-to-10 instructions; in the kernel case, I don't know - it was
something like five instructions before it entered tsleep, and I don't
know how long it took tsleep to get to the point where the device could
interrupt without breaking anything.  Looking at the source, I'd
estimate at least another five instructions (and one of those 10 or so
instructions was a CALLS, which is notoriously slow).
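
To make the ordering issue concrete, here is a toy model in C.  It is
not the actual bootblock or kernel code; every name in it (struct
softc, fake_intr, fake_start_device, racy_submit, safe_submit) is made
up for illustration, and the "device" is simply assumed to complete
the instant it is poked, which is the worst case for the host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; not taken from any real driver. */
struct softc {
    bool waiting;   /* host has finished preparing for a completion */
    bool done;      /* a completion was delivered to a prepared host */
    int  lost;      /* completions that arrived before the host was ready */
};

/* Simulated completion interrupt. */
static void fake_intr(struct softc *sc)
{
    assert(sc != NULL);
    if (sc->waiting) {
        sc->waiting = false;
        sc->done = true;
    } else {
        sc->lost++;         /* nobody was prepared: the completion is dropped */
    }
}

/* Simulated device that finishes before the poking "instruction" returns. */
static void fake_start_device(struct softc *sc)
{
    fake_intr(sc);
}

/* Racy order: start the device first, prepare for completion afterwards. */
static bool racy_submit(struct softc *sc)
{
    fake_start_device(sc);
    sc->waiting = true;     /* too late if the device already finished */
    return sc->done;
}

/* Safe order: prepare for completion first, then start the device. */
static bool safe_submit(struct softc *sc)
{
    sc->waiting = true;
    fake_start_device(sc);
    return sc->done;
}
```

With a zero-initialized struct softc, racy_submit() returns false and
leaves lost at 1, while safe_submit() returns true with lost at 0.
The only difference between the two is the order of two statements.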

None of which strikes me as very relevant.  A race is still a race and
arguably should be fixed, especially when it's as simple to fix as the
ones I ran into were, even if the host won't ever lose the race under
normal conditions.  In addition to all the things already raised,
consider someone single-stepping in kgdb or ddb.

> All true, but none could possibly perform all the necessary steps to
> interpret->process->saveresults->interrupt BEFORE the system CPU
> started to interpret the next instruction.

Even when true, does that excuse the code depending on it for
correctness, especially when it's so simple to fix?  I think not.

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse%rodents-montreal.org@localhost
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

