Current-Users archive


Acer M5229 IDE bugs (esp. on sparc64)



Folks:
        The discussion re: Acer M5229 IDE controllers, plus a link I'd
        saved long ago in my bookmarks and happened to trip over today,
        made me dig into the issues I've been seeing on my SunFire V100
        (my main home file / mail / shell server), namely a pretty steady
        stream of:

wd1d: DMA error reading fsbn 160621568 of 160621568-160621631 (wd1 bn 177397712;
 cn 175989 tn 12 sn 44), retrying
wd1: soft error (corrected)

        messages in /var/log/messages.  The disks seem to work just fine
        for some reason, but it does make me worry.
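
        For reference, the block numbers in that message appear to be
        self-consistent with the geometry the drive reports (16 heads,
        63 sectors per track).  The sketch below redoes the arithmetic,
        assuming the usual cylinder / track / sector split of an absolute
        block number; bn_to_chs() and struct chs are names made up for
        this illustration, not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Cylinder / track / sector triple, as printed in the error message. */
struct chs {
	uint32_t cn;	/* cylinder number */
	uint32_t tn;	/* track (head) number within the cylinder */
	uint32_t sn;	/* sector number within the track */
};

/*
 * Hypothetical helper: split an absolute disk block number into
 * cn/tn/sn for a given (reported, not physical) geometry.
 */
static struct chs
bn_to_chs(uint64_t bn, uint32_t heads, uint32_t sectors)
{
	struct chs r;
	uint64_t per_cyl;

	assert(heads > 0 && sectors > 0);
	per_cyl = (uint64_t)heads * sectors;	/* sectors per cylinder */

	r.cn = (uint32_t)(bn / per_cyl);
	r.tn = (uint32_t)((bn % per_cyl) / sectors);
	r.sn = (uint32_t)(bn % sectors);
	return r;
}
```

        Calling bn_to_chs(177397712, 16, 63) gives cn 175989, tn 12,
        sn 44, matching the log line.  The gap between "wd1 bn" and
        "fsbn" (177397712 - 160621568 = 16776144) would then be the
        sector offset at which the d partition starts.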

        Here's the IDE controller / disk related bits of my boot messages:

aceride0 at pci0 dev 13 function 0
aceride0: Acer Labs M5229 UDMA IDE Controller (rev. 0xc3)
aceride0: bus-master DMA support present
aceride0: primary channel configured to native-PCI mode
aceride0: using ivec 180c for native-PCI interrupt
atabus0 at aceride0 channel 0
aceride0: secondary channel configured to native-PCI mode
atabus1 at aceride0 channel 1
wd0 at atabus0 drive 0: <ST3120026A>
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 111 GB, 232581 cyl, 16 head, 63 sec, 512 bytes/sect x 234441648 sectors
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd0(aceride0:0:0): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA)
atapibus0 at atabus1: 2 targets
cd0 at atapibus0 drive 1: <CD-224E, , P.9A> cdrom removable
cd0: drive supports PIO mode 4, DMA mode 2
wd1 at atabus1 drive 0: <ST3120026A>
wd1: drive supports 16-sector PIO transfers, LBA48 addressing
wd1: 111 GB, 232581 cyl, 16 head, 63 sec, 512 bytes/sect x 234441648 sectors
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd1(aceride0:1:0): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA)
cd0(aceride0:1:1): using PIO mode 4, DMA mode 2 (using DMA)

        As I said, I tripped over a bookmark I'd made to a FreeBSD commit
        which supposedly fixed this ([1]), which claims that the firmware
        incorrectly sets the device to "use the ATA66 byte counter instead
        of triggering an interrupt at the zero count of the transfer buffer
        counter" (see bug report and analysis in [2]).  I'm not sure how
        my setup works given this analysis, but it does seem to be happy
        with 2 RAIDFrame mirror sets on wd0 / wd1 and I've experienced no
        data corruption (AFAIK! ;)).  Actually, going back through console
        logs of the machine, I do see log messages about downgrading to
        lower UDMA modes, and even PIO mode 4 in one case (mmm, slow), so
        maybe that's what saves me.
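
        The FreeBSD fix amounts to a read-modify-write of one bit in PCI
        config register 0x4a.  A minimal stand-alone sketch of that
        operation follows, using a fake in-memory config space so it can
        be compiled outside the kernel.  struct fake_pci_conf,
        fake_read(), fake_write() and acer_use_device_intr() are
        hypothetical names for illustration only; the register offset
        and bit value are the ones from the FreeBSD change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ACER_0x4A		0x4a
#define ACER_0x4A_80PIN(chan)	(0x1 << (chan))	/* existing per-channel bits */
#define ACER_0x4A_INT_ZC	0x20	/* use device interrupt as byte count end */

/* Hypothetical stand-in for the chip's 256-byte PCI config space. */
struct fake_pci_conf {
	uint8_t reg[256];
};

static uint8_t
fake_read(const struct fake_pci_conf *c, unsigned int off)
{
	return c->reg[off & 0xff];
}

static void
fake_write(struct fake_pci_conf *c, unsigned int off, uint8_t val)
{
	c->reg[off & 0xff] = val;
}

/* Set only the INT_ZC bit, leaving every other bit in 0x4a untouched. */
static void
acer_use_device_intr(struct fake_pci_conf *c)
{
	assert(c != NULL);
	fake_write(c, ACER_0x4A,
	    fake_read(c, ACER_0x4A) | ACER_0x4A_INT_ZC);
}
```

        The point of OR-ing into the value read back is that bit 0x20
        does not overlap the ACER_0x4A_80PIN() bits already defined for
        the same register, so those survive the write.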
        
        NOTE ALSO (sorry for the shout!) that the FreeBSD code implies all
        revs of the chips <= 0xc4 can't handle DMA in LBA48-mode... I'm not
        sure how/if NetBSD's IDE subsystem handles this, so this might be
        another hole we could fall into... (fortunately? for me, I'm using
        drives which don't need LBA48).  This is probably a question for
        Manuel.
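
        To make the LBA48 concern concrete, here is a sketch of the kind
        of check that restriction would imply.  It is not taken from
        NetBSD or FreeBSD; xfer_needs_lba48() and acer_dma_allowed() are
        hypothetical names, and the "rev <= 0xc4" rule is only what the
        FreeBSD code seems to imply.  The 28-bit limits (2^28 sectors,
        at most 256 sectors per command) are the standard ATA ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LBA28_MAX_SECTORS	(UINT64_C(1) << 28)	/* 268435456 sectors */
#define LBA28_MAX_COUNT		256			/* sectors per command */

/* True if a transfer cannot be expressed as a 28-bit LBA command. */
static bool
xfer_needs_lba48(uint64_t start, uint32_t count)
{
	assert(count > 0);
	return count > LBA28_MAX_COUNT ||
	    start + count > LBA28_MAX_SECTORS;
}

/*
 * Hypothetical policy: on chip revisions <= 0xc4 (assumption from the
 * FreeBSD code), refuse DMA for transfers that need LBA48 so the
 * caller can fall back to PIO for them.
 */
static bool
acer_dma_allowed(uint8_t rev, uint64_t start, uint32_t count)
{
	if (rev <= 0xc4 && xfer_needs_lba48(start, count))
		return false;
	return true;
}
```

        With the drives above (234441648 sectors each, below 2^28), no
        transfer of up to 256 sectors would ever trip the check, which
        is why this setup should be unaffected even at rev. 0xc3.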

        I'm attaching an as-yet untested patch based on the FreeBSD code
        and PR for at least the DMA-completion bit -- see below.

Comments appreciated,
--rafal

[1] http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/ata-chipset.c.diff?r1=1.126.2.1&r2=1.126.2.2&f=h
[2] http://www.freebsd.org/cgi/query-pr.cgi?pr=82261
[3] Patch for the DMA-completion-interrupt issue below; untested (I didn't
    even check it compiles, but will do so and test it later tonight).

Index: aceride.c
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/aceride.c,v
retrieving revision 1.24
diff -u -p -u -p -r1.24 aceride.c
--- aceride.c   1 Jan 2008 14:57:06 -0000       1.24
+++ aceride.c   14 Feb 2008 00:01:32 -0000
@@ -192,8 +192,13 @@ acer_chip_map(struct pciide_softc *sc, s
        interface = PCI_INTERFACE(pci_conf_read(sc->sc_pc, sc->sc_tag,
            PCI_CLASS_REG));
 
-       /* From linux: enable "Cable Detection" */
        if (rev >= 0xC2) {
+               /* From FreeBSD: use device interrupt as byte count end */
+               pciide_pci_write(sc->sc_pc, sc->sc_tag, ACER_0x4A,
+                   pciide_pci_read(sc->sc_pc, sc->sc_tag, ACER_0x4A) | 
+                       ACER_0x4A_INT_ZC);
+
+               /* From linux: enable "Cable Detection" */
                pciide_pci_write(sc->sc_pc, sc->sc_tag, ACER_0x4B,
                    pciide_pci_read(sc->sc_pc, sc->sc_tag, ACER_0x4B)
                    | ACER_0x4B_CDETECT);
Index: pciide_acer_reg.h
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/pciide_acer_reg.h,v
retrieving revision 1.11
diff -u -p -u -p -r1.11 pciide_acer_reg.h
--- pciide_acer_reg.h   25 Dec 2007 18:33:41 -0000      1.11
+++ pciide_acer_reg.h   14 Feb 2008 00:01:33 -0000
@@ -43,6 +43,9 @@
  */
 #define ACER_0x4A_80PIN(chan)  (0x1 << (chan))
 
+/* use device interrupt as byte count end */
+#define ACER_0x4A_INT_ZC       0x20
+
 /* From FreeBSD, for UDMA mode > 2 */
 #define ACER_0x4B      0x4b
 #define ACER_0x4B_UDMA66       0x01

-- 
  Time is an illusion; lunchtime, doubly so.     |/\/\|           Rafal Boni
                   -- Ford Prefect               |\/\/|      
rafal%pobox.com@localhost

