Subject: satalink trouble: lost interrupt
To: None <current-users@NetBSD.org>
From: Thomas Klausner <wiz@NetBSD.org>
List: current-users
Date: 08/14/2005 13:33:53
Hi!

Today, I had trouble with satalink(4). After a "lost interrupt" on
channel 2, the drive on that channel (wd2) stayed inaccessible until
I rebooted. Since the reboot it has been working fine again (so far).

Any ideas whether this is more likely a software or a hardware problem?

The other hard disks, including wd0 and wd1 on the same controller,
continued working.

/var/log/messages output related to satalink, wd2, and raid:

Aug 14 09:01:47 hiro /netbsd: satalink0 at pci0 dev 14 function 0
Aug 14 09:01:47 hiro /netbsd: satalink0: Silicon Image SATALink 3114 (rev. 0x02)
Aug 14 09:01:47 hiro /netbsd: satalink0: 33MHz PCI bus
Aug 14 09:01:47 hiro /netbsd: satalink0: bus-master DMA support present
Aug 14 09:01:47 hiro /netbsd: satalink0: using irq 10 for native-PCI interrupt
Aug 14 09:01:47 hiro /netbsd: atabus0 at satalink0 channel 0
Aug 14 09:01:47 hiro /netbsd: atabus1 at satalink0 channel 1
Aug 14 09:01:47 hiro /netbsd: atabus2 at satalink0 channel 2
Aug 14 09:01:47 hiro /netbsd: atabus3 at satalink0 channel 3
Aug 14 09:01:47 hiro /netbsd: satalink0: port 0: device present, speed: 1.5Gb/s
Aug 14 09:01:47 hiro /netbsd: wd0 at atabus0 drive 0satalink0: port 1: device present, speed: 1.5Gb/s
Aug 14 09:01:47 hiro /netbsd: satalink0: port 2: device present, speed: 1.5Gb/s
Aug 14 09:01:47 hiro /netbsd: wd0(satalink0:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA)
Aug 14 09:01:47 hiro /netbsd: wd1(satalink0:1:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA)
Aug 14 09:01:47 hiro /netbsd: wd2 at atabus2 drive 0: <SAMSUNG SP1614C>
Aug 14 09:01:47 hiro /netbsd: wd2: drive supports 16-sector PIO transfers, LBA48 addressing
Aug 14 09:01:47 hiro /netbsd: wd2: 149 GB, 310101 cyl, 16 head, 63 sec, 512 bytes/sect x 312581808 sectors
Aug 14 09:01:47 hiro /netbsd: wd2: 32-bit data port
Aug 14 09:01:47 hiro /netbsd: wd2: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 7
Aug 14 09:01:47 hiro /netbsd: wd2(satalink0:2:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA)
Aug 14 09:01:47 hiro /netbsd: raid0: RAID Level 5
Aug 14 09:01:47 hiro /netbsd: raid0: Components: /dev/wd2e /dev/wd0e /dev/wd1h
Aug 14 09:01:47 hiro /netbsd: raid0: Total Sectors: 8353536 (4078 MB)
Aug 14 12:50:39 hiro /netbsd: satalink0:2:0: lost interrupt
Aug 14 12:50:39 hiro /netbsd: type: ata tc_bcount: 2048 tc_skip: 0
Aug 14 12:50:39 hiro /netbsd: satalink0:2:0: bus-master DMA error: missing interrupt, status=0x21
Aug 14 12:50:39 hiro /netbsd: satalink0:2:0: device timeout, c_bcount=2048, c_skip0
Aug 14 12:50:39 hiro /netbsd: wd2e: device timeout reading fsbn 80 of 80-83 (wd2 bn 244854108; cn 242910 tn 13 sn 9), retrying
Aug 14 12:51:10 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 12:51:20 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 12:51:20 hiro /netbsd: wd2e: device timeout reading fsbn 80 of 80-83 (wd2 bn 244854108; cn 242910 tn 13 sn 9), retrying
Aug 14 12:51:51 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 12:52:01 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 12:52:01 hiro /netbsd: wd2e: device timeout reading fsbn 80 of 80-83 (wd2 bn 244854108; cn 242910 tn 13 sn 9), retrying
Aug 14 12:52:32 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 12:52:42 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 12:52:42 hiro /netbsd: wd2e: device timeout reading fsbn 80 of 80-83 (wd2 bn 244854108; cn 242910 tn 13 sn 9), retrying
Aug 14 12:53:13 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 12:53:23 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 12:53:23 hiro /netbsd: wd2e: device timeout reading fsbn 80 of 80-83 (wd2 bn 244854108; cn 242910 tn 13 sn 9), retrying
Aug 14 12:53:54 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 12:54:04 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 12:54:04 hiro /netbsd: wd2e: device timeout reading fsbn 80 of 80-83 (wd2 bn 244854108; cn 242910 tn 13 sn 9)
Aug 14 12:54:04 hiro /netbsd: raid0: IO Error.  Marking /dev/wd2e as failed.
Aug 14 12:54:35 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 12:54:45 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 12:54:45 hiro /netbsd: wd2e: device timeout reading fsbn 3149936 of 3149936-3149937 (wd2 bn 248003964; cn 246035 tn 10 sn 54), retrying
Aug 14 12:55:16 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 12:55:26 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 12:55:26 hiro /netbsd: wd2e: device timeout reading fsbn 3149936 of 3149936-3149937 (wd2 bn 248003964; cn 246035 tn 10 sn 54), retrying
Aug 14 12:55:57 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 12:56:07 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 12:56:07 hiro /netbsd: wd2e: device timeout reading fsbn 3149936 of 3149936-3149937 (wd2 bn 248003964; cn 246035 tn 10 sn 54), retrying
Aug 14 12:56:38 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 12:56:48 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 12:56:48 hiro /netbsd: wd2e: device timeout reading fsbn 3149936 of 3149936-3149937 (wd2 bn 248003964; cn 246035 tn 10 sn 54), retrying
Aug 14 12:57:19 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 12:57:29 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 12:57:29 hiro /netbsd: wd2e: device timeout reading fsbn 3149936 of 3149936-3149937 (wd2 bn 248003964; cn 246035 tn 10 sn 54), retrying
Aug 14 12:58:00 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 12:58:10 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 12:58:10 hiro /netbsd: wd2e: device timeout reading fsbn 3149936 of 3149936-3149937 (wd2 bn 248003964; cn 246035 tn 10 sn 54)
Aug 14 12:58:41 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 12:58:51 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 12:58:51 hiro /netbsd: wd2e: device timeout reading fsbn 3151232 of 3151232-3151279 (wd2 bn 248005260; cn 246036 tn 15 sn 27), retrying
Aug 14 12:59:22 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 12:59:32 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 12:59:32 hiro /netbsd: wd2e: device timeout reading fsbn 3151232 of 3151232-3151279 (wd2 bn 248005260; cn 246036 tn 15 sn 27), retrying
Aug 14 13:00:03 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 13:00:13 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 13:00:13 hiro /netbsd: wd2e: device timeout reading fsbn 3151232 of 3151232-3151279 (wd2 bn 248005260; cn 246036 tn 15 sn 27), retrying
Aug 14 13:00:44 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 13:00:54 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 13:00:54 hiro /netbsd: wd2e: device timeout reading fsbn 3151232 of 3151232-3151279 (wd2 bn 248005260; cn 246036 tn 15 sn 27), retrying
Aug 14 13:01:25 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 13:01:35 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 13:01:35 hiro /netbsd: wd2e: device timeout reading fsbn 3151232 of 3151232-3151279 (wd2 bn 248005260; cn 246036 tn 15 sn 27), retrying
Aug 14 13:02:06 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 13:02:16 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 13:02:16 hiro /netbsd: wd2e: device timeout reading fsbn 3151232 of 3151232-3151279 (wd2 bn 248005260; cn 246036 tn 15 sn 27)
Aug 14 13:02:47 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 13:02:57 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 13:02:57 hiro /netbsd: wd2e: device timeout reading fsbn 3850432 of 3850432-3850559 (wd2 bn 248704460; cn 246730 tn 9 sn 53), retrying
Aug 14 13:03:28 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 13:03:38 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 13:03:38 hiro /netbsd: wd2e: device timeout reading fsbn 3850432 of 3850432-3850559 (wd2 bn 248704460; cn 246730 tn 9 sn 53), retrying
Aug 14 13:16:39 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 13:16:39 hiro /netbsd: satalink0:2:0: wait timed out
Aug 14 13:16:39 hiro /netbsd: wd2e: device timeout reading fsbn 3850432 of 3850432-3850559 (wd2 bn 248704460; cn 246730 tn 9 sn 53), retrying
Aug 14 13:16:39 hiro /netbsd: wd2: flush cache command didn't complete
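
The log above is long but repetitive: each failing block is retried
several times before the driver moves on. As a reading aid, here is a
minimal Python sketch that counts the "device timeout reading fsbn"
messages per block. The function name summarize_timeouts, the regular
expression, and the SAMPLE excerpt are my own illustration and not part
of any NetBSD tool; the pattern only assumes the message format seen in
the log above.

```python
import re

# Matches lines such as:
#   wd2e: device timeout reading fsbn 80 of 80-83 (...)
# The pattern is an assumption based only on the messages quoted above.
TIMEOUT_RE = re.compile(
    r"(?P<dev>wd\d+[a-p]): device timeout reading fsbn (?P<fsbn>\d+) of "
    r"(?P<first>\d+)-(?P<last>\d+)"
)


def summarize_timeouts(lines):
    """Return {(partition, fsbn): attempts}, in order of first appearance."""
    counts = {}
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            key = (match.group("dev"), int(match.group("fsbn")))
            counts[key] = counts.get(key, 0) + 1
    return counts


# Short excerpt copied from the log above, for demonstration only.
SAMPLE = """\
Aug 14 12:50:39 hiro /netbsd: wd2e: device timeout reading fsbn 80 of 80-83 (wd2 bn 244854108; cn 242910 tn 13 sn 9), retrying
Aug 14 12:51:10 hiro /netbsd: satalink0 channel 2: reset failed for drive 0
Aug 14 12:51:20 hiro /netbsd: wd2e: device timeout reading fsbn 80 of 80-83 (wd2 bn 244854108; cn 242910 tn 13 sn 9), retrying
Aug 14 12:54:45 hiro /netbsd: wd2e: device timeout reading fsbn 3149936 of 3149936-3149937 (wd2 bn 248003964; cn 246035 tn 10 sn 54), retrying
"""

if __name__ == "__main__":
    for (dev, fsbn), attempts in summarize_timeouts(SAMPLE.splitlines()).items():
        print(f"{dev} fsbn {fsbn}: {attempts} timeout(s)")
```

To use it on the real file, pass the lines of /var/log/messages instead
of SAMPLE. Counting by hand in the log above gives six timeouts each
for fsbn 80, 3149936 and 3151232, and three for fsbn 3850432 before the
log ends.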

 Thomas