Subject: IDE driver misfeature?
To: Tech-kern <tech-kern@NetBSD.ORG>
From: Jukka Marin <jmarin@embedtronics.fi>
List: tech-kern
Date: 08/04/2004 08:51:09
Hello,

Running 2.0beta:

Is it a feature of the new IDE driver that a failing disk can lock up
the whole computer?  It feels like the spl level stays very high during
a disk operation and if a drive is having problems reading a block (for
example), the computer is (almost) dead until the command is completed.

I had complete freezes on my desktop machine recently.  Twice the system
recovered after a wait of tens of seconds.  Once I had to power-cycle
the system to bring the disk (a Seagate Barracuda) back to life; even
the BIOS hung while trying to identify it.

On a laptop, when the disk and driver were retrying reads, the machine
didn't even respond to ping.  The ping packets were all received by
the laptop, though, and when the IDE operation completed, the machine
sent reply packets to all pings at once.  So it seems network
reception works during the IDE operations, but the network stack or
mbuf system or the transmit side are blocked.

I don't think I've ever seen this under 1.6 - a failing disk would
prevent or slow down accesses to itself, but it didn't bring the
whole system to a halt.  Maybe something happened to the driver when
it was split into different chipset drivers?

  -jm