Subject: Re: kern/26568: Yesterday's "pciide" `irqack fix' breaks Promise 202xx controllers
To: None <bouyer@antioche.lip6.fr>
From: List Mail User <track@Plectere.com>
List: netbsd-bugs
Date: 08/06/2004 13:27:07
	Manuel,

	From my reading of the "pre-release/public/free" ATA specs, there
is nothing wrong with your code!  I think the Promise (at least the 202xx)
is just plain weird (like the parallel ports on SIIG PCI cards, it just
doesn't do what everyone else does)!  I don't think any of your other
changes should be needed.  Quite simply, writing a zero to a Promise 202xx
register in the routine "pciide_irqack()" takes out the Promises (ever tried
to get ATAPI to work on a 202xx?  Promise claims it does, at least in Win2K
or WinXP).

	This should probably be dealt with almost as if it were a "quirk"
(it definitely deserves a comment in the code): just don't `irqack' for
the 202xx.  The "extra" write in "pciide_irqack()" almost immediately
causes a new interrupt (I assume it doesn't for other chips; it seems
that it should clear and cancel pending interrupts).
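
	To make the "quirk" idea concrete, here is a rough sketch in plain C.
It is only a model, not the real pciide code: the structure, the flag
"quirk_no_irqack" and the function "model_irqack()" are names invented for
illustration, and the single byte "dma_ctl" merely stands in for the
IDEDMA_CTL register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one IDE channel; NOT the real pciide structures. */
struct model_channel {
	bool     quirk_no_irqack; /* assumed flag, set for Promise 202xx */
	uint8_t  dma_ctl;         /* stands in for the IDEDMA_CTL register */
	unsigned ctl_writes;      /* number of writes issued to dma_ctl */
};

/* Model of the acknowledge step: skip the register write on quirky chips. */
static void
model_irqack(struct model_channel *ch)
{
	/*
	 * Promise 202xx: the extra write is reported to raise a new
	 * interrupt almost immediately, so leave the register alone.
	 */
	if (ch->quirk_no_irqack)
		return;

	ch->dma_ctl = 0;	/* other chips: clear the pending status */
	ch->ctl_writes++;
}
```

	In the real driver the flag would presumably be set once at attach
time for the 202xx, so the interrupt path only pays for one test.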

	BTW, I've been running your patches for about an hour and a half,
with no "bogus intr"s yet (I've got several RAID5s, 6 disks, rebuilding
in turn, a "build.sh" at the top of one tree and a "make -j 3 dependall" in
another tree, all running.  What luck, it just won't fail when you want it to!).
I had three `printf lock-ups' in two and a half hours just after the
changes :)  But I saw only one "bogus intr" on the Via chipset machine
between this morning's early mail and trying a new kernel with your patches
(after reverting to 1.11, it only occurred once).

	Thanks,

	paul shupak

>From netbsd-bugs-owner-track=Plectere.com@NetBSD.org Fri Aug  6 08:13:45 2004
>Delivered-To: netbsd-bugs@netbsd.org
>X-pt: isis.lip6.fr
>Date: Fri, 6 Aug 2004 17:03:39 +0200
>From: Manuel Bouyer <bouyer@antioche.lip6.fr>
>To: paul@Plectere.com
>Cc: gnats-bugs@gnats.NetBSD.org, netbsd-bugs@NetBSD.org
>Subject: Re: kern/26568: Yesterday's "pciide" `irqack fix' breaks Promise 202xx controllers
>References: <200408060924.i769OgWB024731@Plectere.com>
>Mime-Version: 1.0
>Content-Type: multipart/mixed; boundary="ibTvN161/egqYuK8"
>Content-Disposition: inline
>In-Reply-To: <200408060924.i769OgWB024731@Plectere.com>
>User-Agent: Mutt/1.4.2i
>X-Scanned-By: isis.lip6.fr
>Sender: netbsd-bugs-owner@NetBSD.org
>Precedence: list
>
>
>--ibTvN161/egqYuK8
>Content-Type: text/plain; charset=us-ascii
>Content-Disposition: inline
>
>On Fri, Aug 06, 2004 at 02:24:43AM -0700, paul@Plectere.com wrote:
>> >Fix:
>> 	
>> 	Revert the change for the Promises?  Maybe a further test on the
>> wdc state beyond the simple "wdcintr(wdc_cp)"? - Either way, please do not
>> write the "IDEDMA_CTL" during the interrupt without acknowledging the interrupt
>> to the hardware (i.e. the EOI dance on x86) (If a DMA is really pending, we
>> can get into the infinite-loop case described (remember, now the WDC `cause'
>> has been cleared) beginning when the outstanding DMA completes or we lose
>> the outstanding transaction - neither is a good choice;  The outstanding
>> request causes another "bogus" interrupt, etc), or look into a non-zero
>> return and doing the EOI dance to prevent redelivery of the same interrupt
>> (Note: the case in the Promise returns zero, if we're eating the interrupt,
>> we probably should return one -- i.e. "rv = 1;" ? - I didn't test this, but
>> it seems like it might be simple enough to work).
>
>Can you try the attached patch, then?
>But now there is something I don't understand. How did it happen that we
>didn't run into this infinite loop before? If clearing the interrupt cause
>isn't enough, then we should have seen the same behavior with the old code.
>If wdcintr() didn't claim the interrupt, there's no reason it will claim
>it the next time.
>
>-- 
>Manuel Bouyer <bouyer@antioche.eu.org>
>     NetBSD: 26 years of experience will always make the difference
>--
>... patches below:
>