NetBSD-Bugs archive

Re: port-i386/41706: disk subsystem unresponsive after (recovered) disk failure

On Sun, Jul 12, 2009 at 03:05:00PM +0000, wrote:
> >Description:
> sd1 failed on the above system a couple of days ago.  What I could see
> on the console were the messages from ahc1 being reset.  sd1 became
> unready and would no longer respond positively to a TEST UNIT READY command
> (firmware diagnostic failure given as the reason).
> The system sat there for 2 more days without further kernel messages.
> Pressing return on the console would produce a new login prompt from getty.
> The system was pingable and did accept TCP connections (e.g. to the SSH port).
> But no disk IO would happen and no error messages were printed.
> IOW, the block IO subsystem seems to have been deadlocked at a high level.

This is an issue with timeouts in the ahc driver (I found it with a tape drive
where some mt or chio operations would take too long). I have a patch for this
(on a powered-down system; I'll have a look tomorrow).
From memory, the workaround was to not send a BDR message and to do a bus
reset directly.
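
To make the idea concrete, here is a toy model of that recovery choice. It is
not the ahc driver code and not the patch: the names Recovery, recovery_steps
and skip_bdr are made up for illustration, and the "BDR first, then bus reset"
default ordering is an assumption, not something checked against the driver.

```python
from enum import Enum


class Recovery(Enum):
    # Hypothetical labels for the two recovery actions named in the mail.
    BDR = "bus device reset message to the target"
    BUS_RESET = "SCSI bus reset"


def recovery_steps(skip_bdr):
    """Return the recovery actions to try, in order, after a command timeout.

    Assumed model: with skip_bdr=False the target is first sent a BDR
    message and the bus reset comes second. With skip_bdr=True (the
    workaround recalled above) the BDR step is dropped and the bus is
    reset directly.
    """
    if skip_bdr:
        return [Recovery.BUS_RESET]
    return [Recovery.BDR, Recovery.BUS_RESET]
```

With skip_bdr=True the list holds only the bus reset, so recovery never waits
on a BDR message that the device may not be able to answer.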

Manuel Bouyer <>
     NetBSD: 26 years of experience will always make the difference
