Subject: wd `lost interrupt' problems
To: None <current-users@sun-lamp.cs.berkeley.edu>
From: Charles Hannum <mycroft@duality.gnu.ai.mit.edu>
List: current-users
Date: 03/03/1994 19:47:25

While my hacking on wd.c has fixed all but one of the problems I know
of on most hardware (the last one being a race condition with
wd[gs]etctlr() and normal I/O), I was still getting `lost interrupt'
messages on one particular machine when doing multisector writes.

It turns out that this is due to semi-flaky hardware.  If I try to
write bytes to the IDE controller too fast, it just occasionally drops
one (actually, two).  By reconfiguring my C&T chipset (using the BIOS
utilities) and adding wait states (specifically, increasing `*-BIT AT
BUS WAIT STATES' to 2, but this will vary depending on the machine), I
was able to completely fix the problem.  Interestingly enough,
single-sector writes never seem to tickle this; the probability of
losing on a given write approaches 100% as the size of the transfer
approaches somewhere around 16k or 32k bytes.
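
One way to see why small writes survive while large ones almost always
lose is a simple back-of-the-envelope model.  The sketch below is an
illustration only, not something taken from wd.c or from measurements:
it assumes that each 16-bit word written to the controller is dropped
independently with some fixed probability, and computes the chance that
at least one word is lost in a transfer of a given size.  The function
name `loss_probability' and the per-word rate are hypothetical.

```c
#include <assert.h>

/*
 * Probability that at least one 16-bit word is dropped during a write of
 * `bytes' bytes, ASSUMING each word is lost independently with probability
 * `p_word'.  Both the independence and the rate are illustrative
 * assumptions, not measured properties of any controller.
 */
static double
loss_probability(double p_word, unsigned long bytes)
{
	unsigned long words = bytes / 2;	/* data port moves 2 bytes per write */
	unsigned long i;
	double p_ok = 1.0;			/* chance that no word has been lost */

	assert(p_word >= 0.0 && p_word <= 1.0);

	for (i = 0; i < words; i++)
		p_ok *= 1.0 - p_word;

	return 1.0 - p_ok;
}
```

With a hypothetical per-word rate of 3e-4, for example, this model
gives a small chance of failure for a single 512-byte sector and a
near-certain failure for a 32768-byte transfer, which is at least the
same shape as the behaviour described above.  The real failure is
presumably timing-dependent rather than purely random, so this should
not be read as more than a rough picture.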

I've also occasionally (like once every three or four months) seen a
strange lossage mode with my NE2000 where if_ed will start spewing
`remote DMA not completed' messages until I reboot.  It's possible
that this problem is related.

