Subject: Re: DMA COHERENCY [ was Re: CVS commit: syssrc ]
To: Matthew Jacob <>
From: Manuel Bouyer <>
List: tech-kern
Date: 03/01/2001 19:34:59
On Tue, Feb 27, 2001 at 11:12:50AM -0800, Matthew Jacob wrote:
> > On Mon, Feb 26, 2001 at 01:39:48PM -0800, Matthew Jacob wrote:
> > > I believe I would still have to change the sparc64 implementation to grok
> > > this. Note that BUS_DMA_COHERENT for bus_dmamem_map means make the CPU's
> > > view of memory coherent (i.e., set PG_NVC (no vcache)). There's no tying
> > > this to setting iommu TTE bits.
> > 
> > Ah, OK: on sparc64 we have two places to deal with, CPU<->memory and
> > memory<->device, each with its own cache, right?
> Yes. It isn't just sparc; other architectures have this issue as well,
> alpha for example. But alpha has an instruction (mb) that can ensure
> coherency at bus_dmamap_sync time.

This is between CPU and RAM, something that should be handled at
dmamem_map(BUS_DMA_COHERENT) time.
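
To make the CPU-side half concrete, here is a toy C model, not kernel code.
Every name in it (sk_dmamem_map, SK_DMA_COHERENT, SK_PTE_NO_VCACHE) is made up
for illustration and only stands in for what a port might do with something
like PG_NVC when the mapping is established:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values for this sketch only; not the real bus_dma flags. */
#define SK_DMA_COHERENT  0x02
#define SK_PTE_NO_VCACHE 0x01   /* stands in for a "no vcache" PTE bit */

/* Toy record of the attributes chosen for a CPU mapping. */
struct sk_cpu_mapping {
	unsigned pte_bits;
};

/*
 * Toy model of the CPU-side decision: when the caller asks for a
 * coherent mapping, the mapping is created with the virtual cache
 * disabled. Nothing here touches an IOMMU; that is a separate step.
 */
static void
sk_dmamem_map(struct sk_cpu_mapping *m, int flags)
{
	assert(m != NULL);
	m->pte_bits = 0;
	if (flags & SK_DMA_COHERENT)
		m->pte_bits |= SK_PTE_NO_VCACHE;
}
```

The point of the sketch is only that this decision is made once, when the
mapping is created, and says nothing about the memory<->device side.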
> > I still believe that memory bus_dmamem_map()'ed BUS_DMA_COHERENT should be
> > from CPU to device (and vice-versa), so both caches should be configured for
> > this.
> But you may not be mapping it into a CPU's virtual address space. Otherwise,
> yes, BUS_DMA_COHERENT for a CPU mapping should probably imply coherence for IO
> mappings as well.
> > > 
> > > And lacking any architectural *requirement* for requiring such ordering, in
> > > fact, I'd rather have a big fat rule breakage with a comment than things just
> > > working because the order of function calls makes things work.
> > 
> > Hmm, you're right. To me it was obvious that a DMA map should be mapped
> > before being loaded, but these are definitely two different things. So
> > maybe we need to handle BUS_DMA_COHERENT in bus_dmamap_load* too, but the
> > behavior needs to be clearly specified first, and the documentation
> > updated at the same time.
> That's the request I have on the table here. It's not all that complex
> when you get down to it. I believe that all you need is to simply modify the
> man page to state that BUS_DMA_COHERENT may be specified in bus_dmamap_load*,
> or you need to state that bus_dmamem_alloc'd memory implies BUS_DMA_COHERENT
> for both CPU and IO mappings.
>      bus_dmamap_load(tag, dmam, buf, buflen, p, flags)
> ....			BUS_DMA_NOWAIT    It is not safe to wait (sleep) for
> 			                  resources during this call.
> ++                      BUS_DMA_COHERENT  This flag is a request to the
>                                           machine dependent code that
>                                           any IOMMU mappings will be
>                                           established in a way to provide
>                                           for byte coherency for the mapped
>                                           memory. If this cannot be achieved
>                                           an error will be returned. Usage of
>                                           bus_dmamap_sync is still required.
>                                           The reason for this flag is that
>                                           some systems need to know at
>                                           IOMMU load time whether or not
>                                           the mapping will be to support
>                                           byte coherency or streaming
>                                           unidirectional I/O in order to
>                                           select which hardware bits to set.
> OR:
>      bus_dmamem_alloc(tag, size, alignment, boundary, segs, ...)
> .....
>             All pages allocated by bus_dmamem_alloc will be assumed
>             to be BUS_DMA_COHERENT even if never mapped via bus_dmamem_map.
>             This is so the bus_dmamap_load implementation will select
>             the correct hardware bits to use when establishing IOMMU mappings.

I prefer the first. We may use memory not allocated from bus_dmamem_alloc().
BUS_DMA_COHERENT should be used with care, though, as this is not guaranteed
to work everywhere (I'm not sure a PCI device can be mapped coherent on
a mips CPU, for example).
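
As a rough sketch of how the first option could behave, here is a small
standalone C model. It is not the NetBSD implementation: sk_dmamap_load, the
SK_* constants, the hw_has_consistent parameter and the choice of EINVAL as
the error are all assumptions made for illustration. It only shows the load
step choosing between a byte-coherent and a streaming IOMMU mapping from the
flag, and failing when coherency was requested but cannot be provided:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch only; not the NetBSD ones. */
#define SK_DMA_NOWAIT   0x01
#define SK_DMA_COHERENT 0x02

/* Hypothetical IOMMU mapping modes a port might choose between. */
enum sk_iommu_mode { SK_IOMMU_STREAMING, SK_IOMMU_CONSISTENT };

struct sk_dmamap {
	enum sk_iommu_mode mode;   /* hardware bits selected at load time */
	int loaded;                /* nonzero once a load has succeeded */
};

/*
 * Toy model of the proposed load-time behavior. hw_has_consistent says
 * whether this imaginary IOMMU can provide byte-coherent mappings.
 */
static int
sk_dmamap_load(struct sk_dmamap *map, int hw_has_consistent, int flags)
{
	assert(map != NULL);
	if (flags & SK_DMA_COHERENT) {
		/* Fail rather than silently fall back to streaming. */
		if (!hw_has_consistent)
			return EINVAL;   /* assumed error code */
		map->mode = SK_IOMMU_CONSISTENT;
	} else {
		map->mode = SK_IOMMU_STREAMING;
	}
	map->loaded = 1;
	return 0;
}
```

A caller that gets the error back would have to fall back to a plain
(streaming) load; in both cases bus_dmamap_sync is still required, as the
proposed man page text says.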

Manuel Bouyer, LIP6, Universite Paris VI.