Subject: Re: PDC sata timeouts.
To: Brian A. Seklecki <bseklecki@collaborativefusion.com>
From: William Fletcher <wfletcher@omina.co.za>
List: netbsd-help
Date: 03/21/2007 13:29:32

Oh, uhm, I'm guessing I should have included more dmesg stuff.

Sorry.

raid0: Error re-writing parity!
Warning: truncating spare disk /dev/wd3a to 312581632 blocks (from 312581681)
RECON: initiating reconstruction on col 1 -> spare at col 2
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2a: error reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
wd2: (id not found)
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15)
raid0: Recon read failed!
raid0: reconstruction failed.
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 8192 tc_skip: 0
raidlookup on device: /dev/wd3a failed!
RECON: initiating reconstruction on col 1 -> spare at col 2
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
wd2a: device timeout reading fsbn 6139840 of 6139840-6139967 (wd2 bn 6139903; cn
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2: soft error (corrected)
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 8192 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=8192, c_skip0
wd2a: device timeout writing fsbn 290709376 of 290709376-290709391 (wd2 bn 290709439; cn 288402 tn 3 sn 34), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 8192 tc_skip: 0
wd2: soft error (corrected)
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
wd2a: device timeout reading fsbn 14615488 of 14615488-14615615 (wd2 bn 14615551; cn 14499 tn 8 sn 55), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
wd2a: device timeout reading fsbn 21262272 of 21262272-21262399 (wd2 bn 21262335; cn 21093 tn 9 sn 24), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2: soft error (corrected)
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
wd2a: device timeout reading fsbn 26698048 of 26698048-26698175 (wd2 bn 26698111; cn 26486 tn 3 sn 34), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2: soft error (corrected)
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
wd2a: device timeout reading fsbn 87686464 of 87686464-87686591 (wd2 bn 87686527; cn 86990 tn 9 sn 40), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2: soft error (corrected)
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2a: error reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
wd2: (id not found)
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15)
raid0: Recon read failed!
raid0: reconstruction failed.
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 8192 tc_skip: 0
RECON: initiating reconstruction on col 1 -> spare at col 2
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
; cn 59069 tn 5 sn 20), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2: soft error (corrected)
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 4096 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=4096, c_skip0
wd2a: device timeout writing fsbn 290712256 of 290712256-290712263 (wd2 bn 290712319; cn 288405 tn 1 sn 16), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 4096 tc_skip: 0
wd2: soft error (corrected)
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2a: error reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
wd2: (id not found)
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
5455; cn 266305 tn 0 sn 15), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
pdcsata0:0:0: lost interrupt
        type: ata tc_bcount: 65536 tc_skip: 0
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15)
raid0: Recon read failed!
raid0: reconstruction failed.
pdcsata0:0:0: lost interrupt
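
With this much output it is hard to see which blocks keep failing, so here is
a minimal Python sketch that tallies the "wdN bn <block>" values from wd error
lines. The function name, the regular expression and the sample text are
illustrative assumptions, not part of any NetBSD tool, and the sketch assumes
each message sits on a single unwrapped line. It also flags any block equal to
2**28 - 1 (268435455), which is the last sector addressable with 28-bit LBA;
whether that boundary has anything to do with the failure here is only a guess.

```python
import re
from collections import Counter

# Last sector addressable with 28-bit LBA (2**28 - 1 = 268435455).
LBA28_MAX = 2**28 - 1

# Matches the "(wdN bn <block>;" part of a wd error line (assumed format,
# taken from the messages quoted in this thread).
BN_RE = re.compile(r"\((wd\d+) bn (\d+);")


def tally_failing_blocks(log_text):
    """Count how often each (disk, block) pair appears in wd error lines."""
    counts = Counter()
    for disk, bn in BN_RE.findall(log_text):
        counts[(disk, int(bn))] += 1
    return counts


# Three lines copied from the output above, unwrapped.
sample = """\
wd2a: device timeout reading fsbn 26698048 of 26698048-26698175 (wd2 bn 26698111; cn 26486 tn 3 sn 34), retrying
wd2a: error reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
wd2a: device timeout reading fsbn 268435392 of 268435392-268435519 (wd2 bn 268435455; cn 266305 tn 0 sn 15), retrying
"""

for (disk, bn), n in tally_failing_blocks(sample).most_common():
    note = " (== 2**28 - 1)" if bn == LBA28_MAX else ""
    print(f"{disk} bn {bn}: {n} error(s){note}")
```

To run it over a full log, read the saved dmesg text into a string and pass it
to tally_failing_blocks() in place of the sample.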

On Tue, Mar 20, 2007 at 11:29:43PM -0400, Brian A. Seklecki wrote:
> If you've eliminated the cable, then that leaves the drives, the
> controller/MB, and driver.
>
> Try running PC-Doctor and doing a low-level sector scan for grown
> defects?
>
> Send us your full dmesg(8)?  Try an intel SATA controller?
>
> ~BAS
>
> On Wed, 2007-03-21 at 02:30 +0200, William Fletcher wrote:
> > Hi list,
> >
> > I've been trying to build a raid(frame) onto two 160G Hitachi hard drives,
> > but always end up with "lost interrupt" messages.
> >
> > I'm using the pdcsata driver, with a "Promise PDC20775 SATA300 controller".
> >
> > The following are some of the messages displayed in the dmesg:
> > pdcsata0:0:0: device timeout, c_bcount=65536, c_skip0
> > wd2a: device timeout reading fsbn 26698048 of 26698048-26698175 (wd2 bn 26698111; cn 26486 tn 3 sn 34), retrying
> > pdcsata0:0:0: lost interrupt
> >         type: ata tc_bcount: 65536 tc_skip: 0
> >
> > The machine in question is running NetBSD 3.1, with a fresh installation.
> >
> > I've tried two identical motherboards, and two identical PDC20775 controllers,
> > different memory and different CPUs, Different SATA cables and different drives (300G).
> >
> > I'm pretty sure the hardware isn't the problem.
> >
> > This problem doesn't seem to occur on machines with only 80 gig hard drives,
> > it appears to be tied in with the bigger 160 gigs, since I have two machines
> > in production with 80 gig Hitachi drives and identical PDC20775 controllers,
> > I reconstructed the RAID on one to check, and used its motherboard with the
> > 160Gs to see if I was perhaps losing my mind.
> >
> > Please help, the voices in my head are growing louder, I don't know how long
> > I can contain them anymore.
> >
> > Thank you all very much in advance.
> >
> -- 
> Brian A. Seklecki <bseklecki@collaborativefusion.com>
> Collaborative Fusion, Inc.

-- 
Omina Solutions  | http://omina.co.za | (012) Ph. 664-2480 F. 664-2474

