Subject: problem with wd.c
To: None <tech-kern@NetBSD.ORG>
From: Brett Lymn <blymn@awadi.com.AU>
List: tech-kern
Date: 08/13/1995 22:45:07
Folks,
        Whilst doing something really perverse I think I came across a
possible problem in wd.c.  I have a challenge to lift some data off an
ESDI disk that has had some sectors scrunged when a rodent decided the
controller card was a good place to pee - the people who want the data
will _give_ me the computer and all its bits if I can get the data
off this disk.  OK, fine, I can cope with the challenge (I think ;-)

I whacked the disk into my machine as the second disk and got the
kernel talking to it.  I tried dd'ing off the disk which worked for a
while, but when it got to one of the dud sectors I got the following
error message repeatedly:

lost interrupt status 59<rdy,seekdone,drq,err> error 10<no_id>

I was not surprised about the message but my machine never came back -
it was stuck in a loop continuously retrying the read on the sector.
After a bit of probing it looks like sc_errors was incremented to 1
and never got any further, so the retry limit was never reached.  I
added some code that counts the number of bad interrupts and kicks
out an EIO when the count exceeds a limit, and my machine came back to
life.  I think it would be valuable to have this in the driver.
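
The idea can be sketched roughly as below.  This is a stand-alone
illustration, not the actual change to wd.c: the structure, the
function names and the limit WD_MAX_LOST_INTR are all made up for the
example, and the real driver keeps its state in its own softc.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical limit; the right value for a real driver is a guess. */
#define WD_MAX_LOST_INTR 5

/* Hypothetical per-transfer state, standing in for the driver's softc. */
struct xfer_state {
    int lost_intr;  /* lost interrupts seen on the current transfer */
    int error;      /* 0 while in progress, errno value once abandoned */
};

/* Call when a transfer is started or completes successfully. */
static void
xfer_reset(struct xfer_state *xs)
{
    assert(xs != NULL);
    xs->lost_intr = 0;
    xs->error = 0;
}

/*
 * Call from the lost-interrupt (timeout) path.
 * Returns 0 if the transfer should be retried, or EIO once the
 * limit has been exceeded and the transfer should be failed.
 */
static int
xfer_lost_intr(struct xfer_state *xs)
{
    assert(xs != NULL);
    if (++xs->lost_intr > WD_MAX_LOST_INTR) {
        xs->error = EIO;
        return EIO;
    }
    return 0;
}
```

The point is only that the counter lives apart from whatever the
normal retry path keeps resetting, so a sector that never answers
eventually fails the request instead of retrying forever.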

Whilst on the wd driver, where are the numbers for the attach
parameters of the drive coming from?  The numbers I am seeing from the
attach do not agree with the numbers given to me by the BIOS format
program; it looks like the wd driver is out by a sector or two on this
particular drive.  I am not sure if the controller is lying or what,
but I need to be sure because the data on the disk is a foreign file
system (an Interactive Unix FS).  I want to peel the data off, reformat
the drive and then stick the data back on - hopefully Interactive's
fsck will then be able to fix the fs damage done by the rodent - so I
have to be certain I get all the sectors in the right places.
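
To show why a sector or two matters, here is a small sketch of the
usual linear block to cylinder/head/sector arithmetic.  The geometry
values used in any test of it are invented; they are not the numbers
from this drive, and the type and function names are only for
illustration.

```c
#include <stdint.h>

/* Drive geometry: cylinders, heads, sectors per track. */
struct chs_geom {
    uint32_t cyls;
    uint32_t heads;
    uint32_t spt;
};

/* A cylinder/head/sector address; sectors are numbered from 1. */
struct chs_addr {
    uint32_t cyl;
    uint32_t head;
    uint32_t sect;
};

/* Total number of addressable sectors for a geometry. */
static uint32_t
geom_total(const struct chs_geom *g)
{
    return g->cyls * g->heads * g->spt;
}

/* Map a linear block number to CHS using the standard formula. */
static struct chs_addr
lba_to_chs(const struct chs_geom *g, uint32_t lba)
{
    struct chs_addr a;

    a.sect = lba % g->spt + 1;
    a.head = (lba / g->spt) % g->heads;
    a.cyl  = lba / (g->spt * g->heads);
    return a;
}
```

With, say, 36 sectors per track, block 35 is sector 36 on head 0; with
35 sectors per track the same block is sector 1 on head 1.  So if the
two geometries disagree, the same block number lands on a different
physical sector from the first track onwards.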

Also, for a long time I have been having problems (unrelated to my
challenge...) with NMIs occurring during intensive use of the SCSI
bus.  I have fiddled my aha1542.c to change the bus on and off times
to the opposite of what they were and this seems to have improved the
situation immeasurably.  What I would like to know is how far can I go
with these numbers?  I know my SCSI performance will probably suffer
but NMIs have a far more dramatic impact!

For those who have not guessed, this is all on a 486DX running NetBSD 1.0.

-- 
Brett Lymn, Computer Systems Administrator, AWA Defence Industries
===============================================================================
"It's fifteen hundred miles to Ankh-Morpork" he said.  "We've got
three hundred and sixty three elephants, fifty carts of forage, the
monsoon's about to break and we're wearing ... we're wearing ... sort
of things, like glass, only dark... dark glass things on our eyes..."
        - Terry Pratchett "Moving Pictures".