Subject: Re: DMA and interrupt errors on DS10?
To: Johan A.van Zanten <johan@giantfoo.org>
From: Brad Beck <bradbeck@sdf1.org>
List: port-alpha
Date: 02/15/2005 04:12:18
For the record, I get the same errors on my PWS 600au (MiataGL). I get
numerous 'stray isa irq 14' messages until wd0 downgrades to PIO mode 4. The
root and other system partitions are mounted from two SCSI disks connected
to the onboard controller. A Western Digital 80GB HDD is the primary
master IDE drive (no slave) and holds /home on /dev/wd0a. A Toshiba
CD-ROM is the secondary master IDE drive (also no slave).
I've swapped everything possible at the hardware level while
troubleshooting this. It is most certainly not a hardware issue.

Dmesg is posted for posterity:


NetBSD 2.0 (GENERIC) #0: Tue Nov 30 21:04:03 UTC 2004
	builds@build:/big/builds/ab/netbsd-2-0-RELEASE/alpha/200411300000Z-obj/big/builds/ab/netbsd-2-0-RELEASE/src/sys/arch/alpha/compile/GENERIC
Digital Personal WorkStation 600au, 598MHz, s/n
8192 byte page size, 1 processor.
total memory = 768 MB
(1896 KB reserved for PROM, 766 MB used by NetBSD)
avail memory = 744 MB
mainbus0 (root)
cpu0 at mainbus0: ID 0 (primary), 21164A-0
cpu0: Architecture extensions: 1<BWX>
cia0 at mainbus0: DECchip 2117x Core Logic Chipset (Pyxis), pass 1
cia0: extended capabilities: 1<BWEN>
cia0: using BWX for PCI config access
pci0 at cia0 bus 0
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
tlp0 at pci0 dev 3 function 0: DECchip 21143 Ethernet, pass 3.0
tlp0: interrupting at dec 550 irq 0
tlp0: DEC , Ethernet address 00:00:f8:76:2d:67
nsphy0 at tlp0 phy 5: DP83840 10/100 media interface, rev. 1
nsphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
tlp0: 10baseT, 10baseT-FDX, 10base2, 10base5
sio0 at pci0 dev 7 function 0: Contaq Microsystems 82C693 PCI-ISA Bridge (rev. 0x00)
cypide0 at pci0 dev 7 function 1
cypide0: Cypress 82C693 IDE Controller (rev. 0x00)
cypide0: bus-master DMA support present
cypide0: primary channel wired to compatibility mode
cypide0: primary channel interrupting at isa irq 14
atabus0 at cypide0 channel 0
cypide1 at pci0 dev 7 function 2
cypide1: Cypress 82C693 IDE Controller (rev. 0x00)
cypide1: hardware does not support DMA
cypide1: primary channel wired to compatibility mode
cypide1: secondary channel interrupting at isa irq 15
atabus1 at cypide1 channel 0
ohci0 at pci0 dev 7 function 3: Contaq Microsystems 82C693 PCI-ISA Bridge (rev. 0x00)
ohci0: interrupting at isa irq 10
ohci0: OHCI version 1.0, legacy support
usb0 at ohci0: USB revision 1.0
uhub0 at usb0
uhub0: Contaq Microsys OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
tga0 at pci0 dev 12 function 0: TGA2 pass 2, board type PS4d20
tga0: 1024 x 768, 32bpp, IBM561 RAMDAC
tga0: interrupting at dec 550 irq 8
wsdisplay0 at tga0 (kbdmux ignored): console (std, vt100 emulation)
ppb0 at pci0 dev 20 function 0: Digital Equipment DECchip 21152 PCI-PCI Bridge (rev. 0x02)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
isp0 at pci1 dev 4 function 0: QLogic 1020 Fast Wide SCSI HBA
isp0: interrupting at dec 550 irq 3
scsibus0 at isp0: 16 targets, 8 luns per target
Texas Instruments PCI1410 PCI-CardBus Bridge (CardBus bridge, revision 0x01) at pci1 dev 10 function 0 not configured
isa0 at sio0
lpt0 at isa0 port 0x3bc-0x3bf irq 7
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0 (mux ignored): console keyboard, using wsdisplay0
pms0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pms0 (mux ignored)
vga0 at isa0 port 0x3b0-0x3df iomem 0xa0000-0xbffff
wsdisplay1 at vga0 (kbdmux ignored)
sb0 at isa0 port 0x220-0x237 irq 5 drq 1: dsp v3.01
audio0 at sb0: half duplex, mmap, independent
midi at sb0 not configured
opl at sb0 not configured
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
spkr0 at pcppi0
isabeep0 at pcppi0
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
mcclock0 at isa0 port 0x70-0x71: mc146818 or compatible
stray isa irq 14
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
Kernelized RAIDframe activated
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 0 lun 0: <IBM, DCAS-34330W, S65A> disk fixed
sd0: 4134 MB, 8205 cyl, 6 head, 171 sec, 512 bytes/sect x 8467200 sectors
sd0: sync (100.00ns offset 8), 16-bit (20.000MB/s) transfers, tagged queueing
sd1 at scsibus0 target 1 lun 0: <COMPAQPC, ST34371N, 0472> disk fixed
sd1: 4094 MB, 5172 cyl, 10 head, 162 sec, 512 bytes/sect x 8386000 sectors
sd1: sync (100.00ns offset 8), 8-bit (10.000MB/s) transfers, tagged queueing
stray isa irq 14
wd0 at atabus0 drive 0: <WDC WD800JB-00ETA0>
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 76319 MB, 155061 cyl, 16 head, 63 sec, 512 bytes/sect x 156301488 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd0(cypide0:0:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
atapibus0 at atabus1: 2 targets
cd0 at atapibus0 drive 0: <TOSHIBA CD-ROM XM-6202B, b\221\311\373\000, 1110> cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2
cd0(cypide1:0:0): using PIO mode 4
stray isa irq 14
root on sd0a dumps on sd0b
root file system type: ffs
stray isa irq 14
stray isa irq 14; stopped logging
ohci0: 1 scheduling overruns
wd0: transfer error, downgrading to PIO mode 4
wd0(cypide0:0:0): using PIO mode 4
wd0a: DMA error writing fsbn 64 of 64-79 (wd0 bn 64; cn 0 tn 1 sn 1), retrying
wd0: soft error (corrected)
Warning: received processor correctable error.
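
For comparing this "degradation dance" across machines, a small script can
tally the relevant lines from a saved dmesg. This is only a rough sketch: the
regular expressions are modeled on the message formats shown in this thread,
the function name summarize() is made up for illustration, and other kernels
or drivers may word these messages differently.

```python
import re
from collections import Counter

# Patterns modeled on the kernel messages quoted in this thread (assumption:
# other NetBSD versions or drivers may phrase them differently).
STRAY_RE = re.compile(r"stray isa irq (\d+)")
DOWNGRADE_RE = re.compile(r"(\w+): transfer error, downgrading to (.+)")
LOST_RE = re.compile(r"lost interrupt")


def summarize(dmesg_text):
    """Tally stray IRQs, lost interrupts, and mode downgrades in dmesg text."""
    stray = Counter()
    downgrades = []
    lost = 0
    for raw in dmesg_text.splitlines():
        # Drop mail quote markers such as "> " so quoted logs parse too.
        line = raw.lstrip("> \t")
        m = STRAY_RE.search(line)
        if m:
            # The kernel stops logging after a few of these, so the count
            # is a lower bound on the real number of stray interrupts.
            stray[int(m.group(1))] += 1
            continue
        m = DOWNGRADE_RE.search(line)
        if m:
            downgrades.append((m.group(1), m.group(2).strip()))
            continue
        if LOST_RE.search(line):
            lost += 1
    return {"stray": dict(stray), "downgrades": downgrades, "lost": lost}


if __name__ == "__main__":
    import sys
    # Usage (hypothetical): dmesg | python summarize_dmesg.py
    print(summarize(sys.stdin.read()))
```

The "downgrades" list keeps the order in which the driver stepped down, which
makes it easy to see whether a box went straight to PIO mode 4 or passed
through the intermediate DMA modes first.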


Johan A.van Zanten wrote:
> On Wed, Feb 09, 2005 at 03:49:59PM -0600, Eric Schnoebelen wrote:
> 
>>- > 	I've got a stack of DS10L's sitting about here, and they
>>- > all do this "degradation dance" during the first access..  Once
>>- > they've done that, they run just fine..
>>- > 
>>- > 	And some of them have 80wire cables (round cables with
>>- > external shielding) and I still see the degradation.
> 
> 
>  
> Manuel Bouyer replied:
> 
>>- 
>>- Probably a problem in the driver then. Are there any programming docs
>>- available for these chips?
> 
> 
> 
> eric@cirr.com (Eric Schnoebelen) wrote:
> 
>>	I don't know.. I certainly don't have them.
>>
>>	According to SRM, the chips are ``Acer Labs M1543C IDE''
>>
>>	Does that help any?
> 
> 
> 
> Several days ago i reported the same problem with a DS10 and a new Seagate
> IDE drive. I thought it might be the cable that came with the DS10,
> because it definitely was not an 80-wire cable.
> 
>  I've borrowed a known good 80-wire cable, and i get the same errors, as
> soon as the drive starts being used.  After that initial "degradation
> dance" i have the same experience: everything calms down.
> 
>  There's dmesg from the machine below.
> 
> How can i help "fix" this? I'm a lousy programmer, but a decent sysadmin.
> 
>  -johan
> 
> 
> 
> Errors:
> 
> wd0: transfer error, downgrading to Ultra-DMA mode 1
> wd0(aceride0:0:0): using PIO mode 4, Ultra-DMA mode 1 (using DMA data transfers)
> wd0h: DMA error reading fsbn 16 of 16-31 (wd0 bn 16; cn 0 tn 0 sn 16), retrying
> wd0: soft error (corrected)
> aceride0:0:0: lost interrupt
> 	type: ata tc_bcount: 2048 tc_skip: 0
> aceride0:0:0: bus-master DMA error: missing interrupt, status=0x21
> wd0: transfer error, downgrading to DMA mode 2
> wd0(aceride0:0:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
> wd0h: DMA error reading fsbn 312168096 of 312168096-312168099 (wd0 bn 312168096; cn 309690 tn 9 sn 9), retrying
> aceride0:0:0: lost interrupt
> 	type: ata tc_bcount: 2048 tc_skip: 0
> aceride0:0:0: bus-master DMA error: missing interrupt, status=0x21
> wd0: transfer error, downgrading to PIO mode 4
> wd0(aceride0:0:0): using PIO mode 4
> wd0h: DMA error reading fsbn 312168096 of 312168096-312168099 (wd0 bn 312168096; cn 309690 tn 9 sn 9), retrying
> wd0: soft error (corrected)
> 
> 
> 
> dmesg:
> 
> Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004
>     The NetBSD Foundation, Inc.  All rights reserved.
> Copyright (c) 1982, 1986, 1989, 1991, 1993
>     The Regents of the University of California.  All rights reserved.
> 
> NetBSD 2.0 (PARATHA) #0: Sun Feb  6 22:06:04 CST 2005
> 	johan@sarasvati:/local/NetBSD/src/NetBSD-2.0/src/sys/arch/alpha/compile/obj.alpha/PARATHA
> COMPAQ AlphaServer DS10 466 MHz, s/n r020dqmz00
> 8192 byte page size, 1 processor.
> total memory = 1024 MB
> (2848 KB reserved for PROM, 1021 MB used by NetBSD)
> avail memory = 999 MB
> mainbus0 (root)
> cpu0 at mainbus0: ID 0 (primary), 21264-4
> cpu0: Architecture extensions: 303<PAT,MVI,FIX,BWX>
> tsc0 at mainbus0: 21272 Core Logic Chipset, Cchip rev 0
> tsc0: 2 Dchips, 1 memory bus of 16 bytes
> tsc0: arrays present: 512MB, 512MB, 0MB, 0MB, Dchip 0 rev 1
> tsp0 at tsc0
> pci0 at tsp0 bus 0
> pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
> sio0 at pci0 dev 7 function 0: Acer Labs M1543 PCI-ISA Bridge (rev. 0xc3)
> tlp0 at pci0 dev 9 function 0: DECchip 21143 Ethernet, pass 4.1
> tlp0: interrupting at dec 6600 irq 29
> tlp0: DEC , Ethernet address 08:00:2b:86:77:93
> tlp0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> tlp1 at pci0 dev 11 function 0: DECchip 21143 Ethernet, pass 4.1
> tlp1: interrupting at dec 6600 irq 30
> tlp1: DEC , Ethernet address 08:00:2b:86:77:a8
> tlp1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> aceride0 at pci0 dev 13 function 0
> aceride0: Acer Labs M5229 UDMA IDE Controller (rev. 0xc1)
> aceride0: bus-master DMA support present
> aceride0: primary channel wired to compatibility mode
> aceride0: primary channel interrupting at isa irq 14
> atabus0 at aceride0 channel 0
> aceride0: secondary channel wired to compatibility mode
> aceride0: secondary channel interrupting at isa irq 15
> atabus1 at aceride0 channel 1
> esiop0 at pci0 dev 15 function 0: Symbios Logic 53c895 (ultra2-wide scsi)
> esiop0: using on-board RAM
> esiop0: interrupting at dec 6600 irq 39
> scsibus0 at esiop0: 16 targets, 8 luns per target
> isa0 at sio0
> lpt0 at isa0 port 0x3bc-0x3bf irq 7
> com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
> com0: console
> com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
> pcppi0 at isa0 port 0x61
> spkr0 at pcppi0
> isabeep0 at pcppi0
> fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
> mcclock0 at isa0 port 0x70-0x71: mc146818 or compatible
> esiop0: switching to single-ended mode
> fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
> stray isa irq 15
> stray isa irq 15
> scsibus0: waiting 2 seconds for devices to settle...
> stray isa irq 14
> stray isa irq 14
> stray isa irq 14
> stray isa irq 14
> stray isa irq 14; stopped logging
> stray isa irq 15
> sd0 at scsibus0 target 0 lun 0: <IBM, DDYS-T09170N, S93E> disk fixed
> sd0: 8748 MB, 15110 cyl, 3 head, 395 sec, 512 bytes/sect x 17916240 sectors
> sd0: sync (50.00ns offset 31), 16-bit (40.000MB/s) transfers, tagged queueing
> sd1 at scsibus0 target 1 lun 0: <IBM, DDYS-T09170N, S93E> disk fixed
> sd1: 8748 MB, 15110 cyl, 3 head, 395 sec, 512 bytes/sect x 17916240 sectors
> sd1: sync (50.00ns offset 31), 16-bit (40.000MB/s) transfers, tagged queueing
> wd0 at atabus0 drive 0: <ST3160021A>
> wd0: drive supports 16-sector PIO transfers, LBA48 addressing
> wd0: 149 GB, 310101 cyl, 16 head, 63 sec, 512 bytes/sect x 312581808 sectors
> wd0: 32-bit data port
> wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
> wd0(aceride0:0:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA data transfers)
> atapibus0 at atabus1: 2 targets
> cd0 at atapibus0 drive 0: <COMPAQ  CDR-8435, , 0013> cdrom removable
> cd0: 32-bit data port
> cd0: drive supports PIO mode 4, DMA mode 2
> cd0(aceride0:1:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
> root on sd0a dumps on sd0b
> root file system type: ffs
> wd0: transfer error, downgrading to Ultra-DMA mode 1
> wd0(aceride0:0:0): using PIO mode 4, Ultra-DMA mode 1 (using DMA data transfers)
> wd0h: DMA error reading fsbn 16 of 16-31 (wd0 bn 16; cn 0 tn 0 sn 16), retrying
> wd0: soft error (corrected)
> aceride0:0:0: lost interrupt
> 	type: ata tc_bcount: 2048 tc_skip: 0
> aceride0:0:0: bus-master DMA error: missing interrupt, status=0x21
> wd0: transfer error, downgrading to DMA mode 2
> wd0(aceride0:0:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
> wd0h: DMA error reading fsbn 312168096 of 312168096-312168099 (wd0 bn 312168096; cn 309690 tn 9 sn 9), retrying
> aceride0:0:0: lost interrupt
> 	type: ata tc_bcount: 2048 tc_skip: 0
> aceride0:0:0: bus-master DMA error: missing interrupt, status=0x21
> wd0: transfer error, downgrading to PIO mode 4
> wd0(aceride0:0:0): using PIO mode 4
> wd0h: DMA error reading fsbn 312168096 of 312168096-312168099 (wd0 bn 312168096; cn 309690 tn 9 sn 9), retrying
> wd0: soft error (corrected)
>