tech-kern archive


Re: SunFire v100 / Acer M5229 IDE DMA error workaround



On Wed, Oct 29, 2008 at 01:57:11PM -0400, Rafal Boni wrote:
> On Wed, Oct 29, 2008 at 12:22:13PM -0500, David Young wrote:
> > On Wed, Oct 29, 2008 at 12:05:39PM -0400, Rafal Boni wrote:
> > > Folks:
> > >   I've been taunted by the IDE interface on my SunFire V100 for a long
> > >   (LOOONG!) time with messages along the lines of:
> > > 
> > >   wdNN: DMA error writing fsbn xxxx of xxxx-yyy (wdN bn pppp; cn ccc tn tt sn ss), retrying
> > >   wdN: soft error (corrected)
> > > 
> [...]
> > > ---8<------8<------8<------8<------8<------8<------8<------8<------8<---
> > > Index: pci/pciide_common.c
> > > ===================================================================
> > > RCS file: /cvsroot/src/sys/dev/pci/pciide_common.c,v
> > > retrieving revision 1.38
> > > diff -u -p -r1.38 pciide_common.c
> > > --- pci/pciide_common.c   18 Mar 2008 20:46:37 -0000      1.38
> > > +++ pci/pciide_common.c   29 Oct 2008 15:30:21 -0000
> > > @@ -737,7 +738,9 @@ pciide_dma_finish(v, channel, drive, for
> > >   ATADEBUG_PRINT(("pciide_dma_finish: status 0x%x\n", status),
> > >       DEBUG_XFERS);
> > >  
> > > - if (force == WDC_DMAEND_END && (status & IDEDMA_CTL_INTR) == 0)
> > > + /* XXXrkb: From FreeBSD; should probably add an evcnt here */
> > > + if (force == WDC_DMAEND_END && 
> > > +     ((status & (IDEDMA_CTL_INTR | IDEDMA_CTL_ACT)) != IDEDMA_CTL_INTR))
> > >           return WDC_DMAST_NOIRQ;
> > 
> > I have a hunch that this is not necessary.  After you introduce the new
> > bus_space_write_1() call, below, does the condition IDEDMA_CTL_INTR &&
> > IDEDMA_CTL_ACT ever occur?
> 
> The above is in fact necessary; the bus_space_write_1() on the DMA control
> register I added in pciide_dma_finish() may, however, not be.  That's left
> over from try N-1 after noticing that both FreeBSD and OpenBSD wrote to the
> DMA status register there to clear its sticky bits.
> 
> My testing with just the status-clear in pciide_dma_finish() didn't help
> the issue I'm seeing; it merely changed the status values reported later,
> but they still contained both IDEDMA_CTL_INTR & IDEDMA_CTL_ACT.  In fact,
> the reason I resorted to this is that the debug messages I added showed
> that both IDEDMA_CTL_INTR & IDEDMA_CTL_ACT were set *every time* the system
> reported an IDE DMA error.

Hmm.  What status values are reported?  The IDE controller is not
provoking a PCI abort with an errant DMA, is it?

You say that you are using both channels.  Is it possible that the
control/status registers on the two channels are not 100% independent,
or else that the two channels are not 100% independent?

> > 
> > >   /* stop DMA channel */
> > > @@ -752,6 +755,9 @@ pciide_dma_finish(v, channel, drive, for
> > >       BUS_DMASYNC_POSTREAD : BUS_DMASYNC_POSTWRITE);
> > >   bus_dmamap_unload(sc->sc_dmat, dma_maps->dmamap_xfer);
> > >  
> > > + /* Clear status bits */
> > > + bus_space_write_1(sc->sc_dma_iot, cp->dma_iohs[IDEDMA_CTL], 0, status);
> > > +
> > 
> > I may be missing something, but by my reading of a PCI IDE controller
> > spec that I scrounged off the web, it is important to acknowledge the
> > interrupt in this way.  ISTM that the code should already acknowledge
> > the interrupt by calling pciide_irqack().  Not so?
> 
> It probably is; as I said, I noticed that both Free- and OpenBSD do it
> in both places, so I thought it was worth trying.  I'll double-check to
> make sure this change isn't necessary for things to work with the above
> change.

Ok.  I'm curious what you find out.

> > Note that this write may not be flushed to the device, and the
> > interrupt deasserted, until a second call to pciide_dma_finish() calls
> > bus_space_read_1(, cp->dma_iohs[IDEDMA_CTL], ).  In other words, you
> > may take two interrupts per DMA completed.
> 
> Hmm, this is a theory that's worth exploring more; however, I would expect
> the bus_space calls to handle any necessary flushing.

I'm guessing that they do not. :-(

Dave

-- 
David Young             OJC Technologies
dyoung%ojctech.com@localhost      Urbana, IL * (217) 278-3933 ext 24

