Subject: [ was a commit wazoo- now a tech-kern issue ]
To: Eduardo Horvath, Jason R Thorpe <thorpej@zembu.com>
From: Matthew Jacob <mjacob@feral.com>
List: tech-kern
Date: 02/24/2001 00:08:10
[ moved to tech-kern for more substantive discussion ]

On 24 Feb 2001 eeh@netbsd.org wrote:

> 
> 	No,  bus_dmamap_sync aren't missing. 
> 
> 	Ed?
> 
> The SBus and some PCI controllers on Ultras have
> streaming caches on the IOMMU that need to be
> explicitly flushed.  Those flushes are quite
> expensive, so in most cases where a device
> will do small I/O operations to a fixed buffer,
> it is best to load the DMA mappings with the
> streaming buffers disabled.

Yes. I know. The problem here is that Jason/Izumi are right- and I'm as usual
a complete idiot for not having seen it in my haste to get the damned thing
working on the Ultra1.

The fact that this works with bus_dmamap_load_raw is a bug- not a feature. It
works because you *happen* to call iommu_enter from iommu_dvmamap_load_raw
passing flags on through- not because the bit from the call to
bus_dmamem_map (iommu_dvmamem_map) is preserved. Preserving it wouldn't
matter anyway, because the isp driver calls bus_dmamem_map *later*.

But the problem is actually more fundamental here- there is information you
need that can *only* be passed in bus_dmamem_map (this is the *only* place in
the spec that allows passing of BUS_DMA_COHERENT- oh, well, there's
bus_dmamem_mmap)- but nothing requires you to call bus_dmamem_map prior to
bus_dmamap_load[_raw]- nor is there any requirement to call it *at all*.
Therefore this architecture is missing something substantial, I believe.

I suggest that bus_dmamem_alloc and bus_dmamap_load[_raw] be allowed to take
BUS_DMA_COHERENT- the flag still has meaning even if it only affects the
device's view of memory. Otherwise it may be too difficult to actually
support some architectures.

The only other 'legal' way around this is to use the BUS_DMA_BUS[1-4]
stuff for sun4u.
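
To make the ordering problem concrete, here is a toy userland model of it.
This is only a sketch and not the sun4u code: every toy_* name, the flag
value, and the idea that a strictly conforming load masks the flag off are
assumptions made up for illustration. The point is just that the load step
never sees what the mem_map step was told.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BUS_DMA_COHERENT; not the real value. */
#define TOY_DMA_COHERENT 0x01

/* What the memory-mapping step was told. Nothing links this to a map. */
struct toy_dmamem {
	bool coherent_requested;
};

/* A DMA map; 'streaming' means the IOMMU streaming buffer is enabled. */
struct toy_dmamap {
	bool loaded;
	bool streaming;
};

/* Models bus_dmamem_map: the one call where the spec accepts COHERENT. */
void
toy_dmamem_map(struct toy_dmamem *mem, int flags)
{
	assert(mem != NULL);
	mem->coherent_requested = (flags & TOY_DMA_COHERENT) != 0;
}

/*
 * Models a load that follows the spec strictly (an assumption): COHERENT
 * is not a valid load flag, so it is masked off and streaming stays on.
 */
void
toy_dmamap_load_strict(struct toy_dmamap *map, int flags)
{
	assert(map != NULL);
	flags &= ~TOY_DMA_COHERENT;
	map->streaming = (flags & TOY_DMA_COHERENT) == 0;	/* always true */
	map->loaded = true;
}

/* Models the proposal: the load call itself may be passed COHERENT. */
void
toy_dmamap_load_proposed(struct toy_dmamap *map, int flags)
{
	assert(map != NULL);
	map->streaming = (flags & TOY_DMA_COHERENT) == 0;
	map->loaded = true;
}
```

In this model, calling toy_dmamem_map with TOY_DMA_COHERENT and then
toy_dmamap_load_strict still leaves the map streaming, whatever order the
two calls come in. toy_dmamap_load_proposed turns streaming off on its own,
even if toy_dmamem_map is never called.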

> 
> Actually, it does not look like you are sync-ing
> before and after every read and write to the request
> and response queues.  But that would certainly have
> a noticeable performance impact.

I am. 

-matt