Subject: Re: esp/fas
To: None <eeh@netbsd.org>
From: Andrei Petrov <and@genesyslab.com>
List: port-sparc64
Date: 01/30/2001 12:34:05
On 30 Jan 2001 eeh@netbsd.org wrote:

> 
> 	It's the same DVMA problem. The very new kernel (as of today)
> 	doesn't cause DMA errors, works for some time (longer than before), but
> 	ends up in ddb with:
> 
> 	iommu_strbuf_flush: flush timeout 0x100000000 at 0x3cbbbc8
> 
> The IOMMU's streaming buffer is flushed by poking a 
> register with a value and then waiting until the 
> IOMMU has written something to a particular memory 
> location.  This should occur within 1/2 second or
> it's considered a timeout.  
> 
> There are a couple of possible reasons for the
> timeout:
> 
> 1) The streaming buffer is not enabled.  If you 
> have coherent mappings the streaming buffer is not
> used.

I think it's all coherent, I'll check this.

> 
> 2) Interrupt latency/clock problems.  If the clock
> is running too fast you may not wait long enough.
> 
> 3) There is some discrepancy between the variable
> being waited on and the address changed by the
> IOMMU.

These 2 still sound like magic. :-)  Back to reading.

> 
> In any case, this should not be a fatal condition,
> and you can probably work around it by disabling
> the streaming cache entirely or using only COHERENT
> mappings.
> 
> Although, looking at the code:
> 

There is also:

	if (!is->is_sb)
		return (0);

	is->is_flush = 0;

> #ifdef DIAGNOSTIC
> 	if (!is->is_flush) {
> 		printf("iommu_strbuf_flush: flush timeout %p at %p\n",
> 		    (void *)(u_long)is->is_flush, 
> 		    (void *)(u_long)is->is_flushpa); /* panic? */
> 

and another one:
		Debugger();

So I'd expect this code to drop into the debugger as soon as it is reached
once the streaming buffer is enabled. But that happens only after a while.
I haven't yet found the place where that buffer is enabled.

I'll try to continue from ddb and see what happens.

> if is->is_flush really is 0x100000000, either the code generation
> is broken, or the value was set between the test and the printf ()
> call.  Maybe you should just increase the timeout a bit.

I can compile without optimization; that helped me before with
code generation problems. I don't yet understand that 'timeout' magic.
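
On the possibility that the value was set between the test and the
printf() call: one way to rule that out is to read the flag once and use
that single snapshot for both the test and the message. The helper below is
a hypothetical user-space sketch of that idea only, not a patch to
iommu_strbuf_flush; its name and signature are made up for illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: check a completion flag that another agent
 * (here, the IOMMU) may write at any moment. The flag is read exactly
 * once, so the test and the message always describe the same value.
 * Returns true and fills msg if the flag is still zero (timeout).
 */
static bool
flush_timed_out(const volatile uint64_t *flag, char *msg, size_t msglen)
{
	assert(flag != NULL && msg != NULL && msglen > 0);

	uint64_t snap = *flag;	/* single read of the volatile word */

	if (snap != 0) {
		msg[0] = '\0';	/* flush completed: nothing to report */
		return false;
	}
	snprintf(msg, msglen, "flush timeout 0x%" PRIx64, snap);
	return true;
}
```

With this shape a timeout message can only ever report 0. A message that
shows a non-zero value would then point at something other than a late
write racing with the print.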

And finally, this doesn't happen in 1.5K. It worries me mostly because
I have to decide whether to go with current or stay with 1.5K.
I'd like to work on the fas driver now because it's 'half done', medium-raw ;-)
If I don't find a quick fix for this problem I'll just stay with 1.5K.

Thanks,
	Andrey