tech-kern archive
Re: SunFire v100 / Acer M5229 IDE DMA error workaround
On Wed, Oct 29, 2008 at 02:19:04PM -0500, David Young wrote:
> On Wed, Oct 29, 2008 at 01:57:11PM -0400, Rafal Boni wrote:
> > On Wed, Oct 29, 2008 at 12:22:13PM -0500, David Young wrote:
> > > On Wed, Oct 29, 2008 at 12:05:39PM -0400, Rafal Boni wrote:
> > > > Folks:
> > > > I've been taunted by the IDE interface on my SunFire V100 for a long
> > > > (LOOONG!) time with messages along the lines of:
> > > >
> > > > wdNN: DMA error writing fsbn xxxx of xxxx-yyy (wdN bn pppp; cn ccc tn tt sn ss), retrying
> > > > wdN: soft error (corrected)
> > > >
> > [...]
> > > > ---8<------8<------8<------8<------8<------8<------8<------8<------8<---
> > > > Index: pci/pciide_common.c
> > > > ===================================================================
> > > > RCS file: /cvsroot/src/sys/dev/pci/pciide_common.c,v
> > > > retrieving revision 1.38
> > > > diff -u -p -r1.38 pciide_common.c
> > > > --- pci/pciide_common.c 18 Mar 2008 20:46:37 -0000 1.38
> > > > +++ pci/pciide_common.c 29 Oct 2008 15:30:21 -0000
> > > > @@ -737,7 +738,9 @@ pciide_dma_finish(v, channel, drive, for
> > > > ATADEBUG_PRINT(("pciide_dma_finish: status 0x%x\n", status),
> > > > DEBUG_XFERS);
> > > >
> > > > - if (force == WDC_DMAEND_END && (status & IDEDMA_CTL_INTR) == 0)
> > > > + /* XXXrkb: From FreeBSD; should probably add an evcnt here */
> > > > + if (force == WDC_DMAEND_END &&
> > > > + ((status & (IDEDMA_CTL_INTR | IDEDMA_CTL_ACT)) != IDEDMA_CTL_INTR))
> > > > return WDC_DMAST_NOIRQ;
> > >
> > > I have a hunch that this is not necessary. After you introduce the new
> > > bus_space_write_1() call, below, does the condition IDEDMA_CTL_INTR &&
> > > IDEDMA_CTL_ACT ever occur?
> >
> > The above is in fact necessary; the bus_space_write_1() on the DMA control
> > register I added in pciide_dma_finish() may, however, not be. That's left
> > over from try N-1 after noticing that both FreeBSD and OpenBSD wrote to the
> > DMA status register there to clear its sticky bits.
> >
> > My testing with just the status-clear in pciide_dma_finish() didn't help
> > the issue I'm seeing; it merely changed the status values reported later,
> > but they still contained both IDEDMA_CTL_INTR & IDEDMA_CTL_ACT. In fact,
> > the reason I resorted to this is that the debug messages I added showed
> > that both IDEDMA_CTL_INTR & IDEDMA_CTL_ACT were set *every time* the system
> > reported an IDE DMA error.
>
> Hmm. What status values are reported? The IDE controller is not
> provoking a PCI abort with an errant DMA, is it?
It's been a while since I ran the system without *any* changes to the
IDE subsystem, but I believe that in the case where things failed I'd
see status as either 0x25 (ACTIVE | INTR | DRV_DMA(0)) or 0x65 (ACTIVE |
INTR | DRV_DMA(0) | DRV_DMA(1)).
Strangely enough, after adding the reset of the DMACTL register in
pciide_dma_finish(), I'd see status as just 0x05 (ACTIVE | INTR).
As far as PCI aborts go, I'm not sure... how would I go about looking
for that?
> You say that you are using both channels. Is it possible that the
> control/status registers on the two channels are not 100% independent,
> or else that the two channels are not 100% independent?
Anything is possible; this is, after all, Sun, who brought us the joys of
the CMD-646. But I think I would have been able to find it in Google if
that were the case... at least people complaining about the state of
general *BSD support... but the only things I've found are a reference
or two to FreeBSD bugs which look like they have since been fixed.
(I haven't tried FreeBSD or OpenBSD on the box, mind you).
--rafal
--
Time is an illusion; lunchtime, doubly so. |/\/\| Rafal Boni
-- Ford Prefect |\/\/|
rafal%pobox.com@localhost