Subject: UltraDMA problems under 1.6F (probably hardware)
To: None <current-users@netbsd.org>
From: gabriel rosenkoetter <gr@eclipsed.net>
List: current-users
Date: 08/13/2002 18:20:52

Having fried the perfectly good (well, mostly good; it *was* a Via
chipset...) motherboard in my main workstation at home in a macho
"don't need a grounding strap to add a PCI card" ESD moment, I
replaced the motherboard and managed to get things back up and
running with one bit of irritation.

On boot, I see this spew:

pciide0:0:0: lost interrupt
        type: ata tc_bcount: 512 tc_skip: 0
pciide0:0:0: bus-master DMA error: missing interrupt, status=0x61
wd0: transfer error, downgrading to Ultra-DMA mode 2
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA data transfers)
wd1(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA data transfers)
wd0d: DMA error reading fsbn 0 (wd0 bn 0; cn 0 tn 0 sn 0), retrying
pciide0:0:0: lost interrupt
        type: ata tc_bcount: 512 tc_skip: 0
pciide0:0:0: bus-master DMA error: missing interrupt, status=0x61
wd0: transfer error, downgrading to Ultra-DMA mode 1
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 1 (using DMA data transfers)
wd1(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA data transfers)
wd0d: DMA error reading fsbn 0 (wd0 bn 0; cn 0 tn 0 sn 0), retrying
pciide0:0:0: lost interrupt
        type: ata tc_bcount: 512 tc_skip: 0
pciide0:0:0: bus-master DMA error: missing interrupt, status=0x61
wd0: transfer error, downgrading to DMA mode 2
wd0(pciide0:0:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
wd1(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA data transfers)
wd0d: DMA error reading fsbn 0 (wd0 bn 0; cn 0 tn 0 sn 0), retrying
pciide0:0:0: lost interrupt
        type: ata tc_bcount: 512 tc_skip: 0
pciide0:0:0: bus-master DMA error: missing interrupt, status=0x61
wd0: transfer error, downgrading to PIO mode 4
wd0(pciide0:0:0): using PIO mode 4
wd1(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA data transfers)
wd0d: DMA error reading fsbn 0 (wd0 bn 0; cn 0 tn 0 sn 0), retrying
wd0: soft error (corrected)
boot device: sd0
root on sd0a dumps on sd0b
root file system type: ffs
[...]

The relevant attach messages:

pciide0 at pci0 dev 4 function 0: Acer Labs M5229 UDMA IDE Controller (rev. 0xc4)
pciide0: bus-master DMA support present
pciide0: primary channel configured to compatibility mode
wd0 at pciide0 channel 0 drive 0: <Maxtor 4G160J8>
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 152 GB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 320173056 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6
wd1 at pciide0 channel 0 drive 1: <WDC AC310200R>
wd1: drive supports 16-sector PIO transfers, LBA addressing
wd1: 9787 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 20044080 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4 (Ultra/66)
pciide0: primary channel interrupting at irq 14
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA data transfers)
wd1(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA data transfers)
pciide0: secondary channel configured to compatibility mode
pciide0: disabling secondary channel (no drives)

In addition to all this, when the system's actually running, any
kind of heavy I/O involving /dev/wd0 makes X extremely jumpy and,
once, managed to hang the system entirely (as in, it was unreachable
over the network).

I'm not totally convinced that's directly related to the attach
problems... but the jumpiness is definitely related to the partition
on that disk. That partition is formatted with LFS (despite threats
about it being unstable in -current; this is a disk which I use a
lot but which houses data completely recreateable from my CD
collection, so it's a good test point) and with LFS in a state
of flux and UBC-ification, its being flaky isn't enough to prove
anything alone. OTOH, it *did* behave just fine with the old
motherboard and the same kernel (a 1.6B one, which I've still got; I
only upgraded the kernel in the hopes that this was a software
problem that had been fixed). Unfortunately, I lack dmesg output
from then.

At the very least, I can replicate the jumpiness by ripping a CD
(from a SCSI CD-ROM attached to an aic7800), by running
dd if=/dev/zero of=/mp3/tmp/foo (or the opposite, from the disk to
/dev/null), or even by letting lfs_cleanerd do its thing.

Is my basic lack of knowledge about IA32 and IDE/ATA/whatever hardware
biting me here?

Is it relevant that I *think* wd0 is actually an Ultra/133 disk, but
that it was originally labeled and formatted on an Ultra/100
controller[1]? (I also *think* the controller on this motherboard
supports Ultra/133, but I haven't a clue how to go find out beyond
"it says so on the box".)

Are you just not supposed to have Ultra/100 and Ultra/66 disks on
the same bus? (Why not? That's silly. :^>)

Is this Acer IDE controller (not something I picked out very
carefully; it was a reasonably-priced Socket A motherboard and I
wanted my apartment's firewall back up) just trashy?

Would forcing the drive to Ultra-DMA mode 4 have a chance of making
it happy? In the BIOS, or just in the kernel config (that is,
something like wd0 at pciide? channel 0 drive 0 flags 0xfcc)?
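
For what it's worth, here is a rough sketch of how such a flags value
would decode. It assumes the layout that wd(4) appears to describe:
three 4-bit fields (PIO in the low nibble, then DMA, then Ultra-DMA),
where the high bit of each field means "use this setting", the low
three bits give the mode, and 0xf in the DMA or Ultra-DMA field means
"don't use it". The function names are made up for illustration, and
the layout should be checked against the man page for the kernel in
question.

```python
# Sketch only. ASSUMPTION: flags layout as read from wd(4):
#   bits 0-3  PIO field, bits 4-7  DMA field, bits 8-11  Ultra-DMA field.
#   In each field, bit 3 = "use this setting", bits 0-2 = mode number.
#   0xf in the DMA or Ultra-DMA field = do not use that transfer type.

def decode_field(nibble, can_disable):
    """Describe one 4-bit field of the flags value."""
    if can_disable and nibble == 0xF:
        return "disabled"
    if not nibble & 0x8:
        return "default"          # "use" bit clear: driver picks the mode
    return "mode %d" % (nibble & 0x7)


def decode_wd_flags(flags):
    """Split a wd(4)-style flags value into its three fields."""
    return {
        "pio": decode_field(flags & 0xF, can_disable=False),
        "dma": decode_field((flags >> 4) & 0xF, can_disable=True),
        "udma": decode_field((flags >> 8) & 0xF, can_disable=True),
    }


for value in (0xFCC, 0x0C00):
    print(hex(value), decode_wd_flags(value))
```

If that layout is right, 0xfcc asks for PIO mode 4 and DMA mode 4 with
Ultra-DMA turned off, while 0x0c00 would be the value that pins
Ultra-DMA to mode 4 and leaves PIO and DMA at their defaults.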

(Note that this isn't all that important, since what *really* matters
is, of course, on the SCSI disks. The Winchester cruft's just there
for /mp3 and the like. But I *like* being able to listen to music
while I work. And I *don't* like having to swap CDs. :^> And this
UDMA stuff, it should work, no?)

[1] Meaning, btw, that though it's a 160 GB disk, it appears to be
only a 120 GB one, since that's all Ultra/100 can see, which
leads to a corollary question: how's disklabel(8) going to cope
with that? I will surely at least have to relabel, but hopefully not
reformat, to get at the extra (less than) 40 GB, right?
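
A quick back-of-the-envelope check of the sizes involved, as a sketch.
The sector count comes from the wd0 attach line above; the 28-bit LBA
ceiling is an assumption about why the old controller saw less of the
disk, not something confirmed here.

```python
SECTOR_BYTES = 512
TOTAL_SECTORS = 320173056   # from the wd0 attach message
LBA28_SECTORS = 2 ** 28     # ASSUMPTION: old setup was limited to 28-bit LBA


def to_gib(sectors):
    """Size in binary gigabytes (2**30 bytes)."""
    return sectors * SECTOR_BYTES / 2 ** 30


def to_gb(sectors):
    """Size in decimal gigabytes (10**9 bytes), as drive vendors count."""
    return sectors * SECTOR_BYTES / 10 ** 9


# Sectors that only become reachable with 48-bit addressing.
extra_sectors = TOTAL_SECTORS - LBA28_SECTORS

rows = (
    ("whole disk", TOTAL_SECTORS),
    ("28-bit LBA limit", LBA28_SECTORS),
    ("beyond the limit", extra_sectors),
)
for label, sectors in rows:
    print("%-18s %10d sectors  %7.2f GiB  %7.2f GB"
          % (label, sectors, to_gib(sectors), to_gb(sectors)))
```

The "152 GB" in the attach message lines up with the binary figure for
the whole disk, so the kernel is already seeing the full sector count
here; the open question is only what the existing disklabel covers.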

-- 
gabriel rosenkoetter
gr@eclipsed.net
