Subject: isp bus reset gives disk corruption
To: None <port-alpha@netbsd.org, current-users@netbsd.org>
From: Olaf Seibert <rhialto@polderland.nl>
List: port-alpha
Date: 04/20/2001 02:18:46
While using my cd burner, the scsi bus was reset. This is with
NetBSD/alpha 1.5.

Excerpt from boot messages:

isp0 at pci1 dev 8 function 0
isp0: interrupting at dec 550 irq 12
isp0: Ultra Mode Capable
isp0: Board Revision 1040B, loaded F/W Revision 4.65.0
isp0: Last F/W revision was 5.57.1
isp0: 243 max I/O commands supported
isp0: driver initiated bus reset of bus 0
scsibus0 at isp0: 16 targets, 8 luns per target
scsibus0: waiting 2 seconds for devices to settle...
isp0: Bus 0 Target 0 Async Mode 
sd0 at scsibus0 target 0 lun 0: <DEC, RZ1CC-BA (C) DEC, 883F> SCSI2 0/direct fixed
isp0: Bus 0 Target 0 at 20MHz Max Offset 8, 16 bit wide, Tagged Queueing Enabled
...more of these...
sd0: 4091 MB, 3708 cyl, 20 head, 113 sec, 512 bytes/sect x 8380080 sectors
isp0: Bus 0 Target 0 at 20MHz Max Offset 8, 16 bit wide, Tagged Queueing Enabled
cd1 at scsibus0 target 4 lun 0: <YAMAHA, CRW4416S, 1.0h> SCSI2 5/cdrom removable
isp0: Bus 0 Target 4 Async Mode, 16 bit wide, Tagged Queueing Enabled

From syslog:

Apr 19 22:15:28 azenomei /netbsd: isp0: bus reset destroyed command for 0.4.0
Apr 19 22:15:28 azenomei /netbsd: cd1(isp0:4:0): unknown error category 8 from host adapter code
Apr 19 22:15:29 azenomei /netbsd: isp0: timeout initiated SCSI bus reset of bus 0
Apr 19 22:15:29 azenomei /netbsd: 
Apr 19 22:15:29 azenomei /netbsd: isp0: cannot find handle 0x68 in xflist
Apr 19 22:15:29 azenomei /netbsd: isp0: cannot find handle 0x69 in xflist
Apr 19 22:15:29 azenomei /netbsd: isp0: cannot find handle 0x6a in xflist
Apr 19 22:15:29 azenomei /netbsd: isp0: cannot find handle 0x6b in xflist
Apr 19 22:15:29 azenomei /netbsd: isp0: cannot find handle 0x6c in xflist
Apr 19 22:15:29 azenomei /netbsd: isp0: cannot find handle 0x6d in xflist
Apr 19 22:15:29 azenomei /netbsd: isp0: cannot find handle 0x6e in xflist
Apr 19 22:15:29 azenomei /netbsd: isp0: cannot find handle 0x6f in xflist
Apr 19 22:15:30 azenomei /netbsd: isp0: bus reset destroyed command for 0.0.0
Apr 19 22:15:30 azenomei last message repeated 3 times
Apr 19 22:15:30 azenomei /netbsd: isp0: Bus 0 Target 0 at 20MHz Max Offset 8, 16 bit wide, Tagged Queueing Enabled

Some 15 minutes later I discovered a whole bunch of corrupted inodes,
basically a nicely consecutive range. This looks like the sort of
problem that made the machine just hang with 1.4.x.

My guess is that the four occurrences of "bus reset destroyed command
for 0.0.0" caused some garbage blocks to appear in the buffer cache. I
certainly was not writing to any of the corrupted files. Most of them
were in /usr/pkg/bin or were the man pages associated with those
programs, which have consecutive inodes because they were installed
together. I had been trying to access at least some of the files in the
meantime, and when those blocks were synced the corruption hit the
disk, so to speak.

fsck cleared and adjusted a whole bunch of files, but not all. The
remaining ones cannot be removed with rm (at least not at securelevel 1;
maybe it works at 0), so I moved the other files in the directory
elsewhere instead.

Some of them look like this now:

248839 c---r-sr--  25970 1661536817 1902473325  804, 201516 Mar 23 1989 #248839
248842 br-xr-sr--   2670 151664953  1902404705 1056, 197228 Mar 17 1989 #248842
248865 c---r-S---  12595 858007604  151660848   878, 448114 Nov 25 1991 #248865
248867 c--Srws--x  10288 909190176  1129063468  627, 132970 May 11 2031 #248867
248871 c--SrwS---  1     606105972  925969456   356, 132972 Sep  7 2028 cdda2wav
248880 c---r-S---  1     544302180  741749028  2097,  41764 May  6 1999 mpack
248892 br-xr-S-w-  1     824451185  842148918  2616,  36913 Jun  3 1975 tomac
248884 c--SrwSrw-  1     1644759610 606106473  2314, 442409 May 23 2023 unrar
248872 b--xr-sr--  1     1936286217 741352480   817, 222028 Jun 29 1993 zip
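
One way to test the guess that file data ended up in inode blocks is to
look at the two large numeric columns in the listing (presumably owner
and group) as raw bytes. The sketch below is only an illustration, not
something that was run on the affected machine: it assumes the fields
are 32-bit values stored little-endian, as on the Alpha, and the helper
name as_ascii is made up for this example.

```python
import struct

def as_ascii(value, byteorder="<"):
    """Pack a 32-bit field and render its bytes, using '.' for non-printables.

    byteorder "<" (little-endian) is an assumption based on the Alpha CPU.
    """
    raw = struct.pack(byteorder + "I", value)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

# The two large numeric columns for a few rows of the listing above.
fields = {
    "cdda2wav": (606105972, 925969456),
    "mpack": (544302180, 741749028),
    "tomac": (824451185, 842148918),
}

for name, (first, second) in fields.items():
    print(name, repr(as_ascii(first)), repr(as_ascii(second)))
```

For the mpack row the two fields come out as "ddq " and "$16,", which
reads like a fragment of text rather than random bits. Fields that
decode to readable text would fit the idea that data blocks were written
over inodes; fields that decode to dots would point elsewhere.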

Although I do not have a good backup, the files were fortunately mostly
easy to re-generate. At least, the ones I have discovered so far...

-Olaf.
-- 
___ Olaf 'Rhialto' Seibert - rhialto@polder --Soep van de dag, wat zal dat zijn
\X/ land.nl     --wat kan dat wezen, beter maar het ergste vrezen -Boy Bensdorp