Subject: DMA COHERENCY [ was Re: CVS commit: syssrc ]
To: Manuel Bouyer <bouyer@antioche.lip6.fr>
From: Matthew Jacob <mjacob@feral.com>
List: tech-kern
Date: 02/27/2001 11:12:50
> On Mon, Feb 26, 2001 at 01:39:48PM -0800, Matthew Jacob wrote:
> > I still would have to change the sparc64 implementation to grok this I
> > believe. Note that BUS_DMA_COHERENT for bus_dmamem_map means make the
> > CPU's view of memory coherent (i.e., set PG_NVC (no vcache)). There's no tying
> > this to setting iommu TTE bits.
> 
> Ha OK, on sparc64 we have two places to deal with: CPU<->Memory and
> Memory<->device, which each have their own cache, right?

Yes. It isn't just sparc; other architectures have this issue as
well, alpha for example. But alpha has an instruction (mb) that can ensure
coherency at bus_dmamap_sync time.

> I still believe that memory bus_dmamem_map()'ed BUS_DMA_COHERENT should be
> coherent from CPU to device (and vice-versa), so both caches should be
> configured for this.

But you may not be mapping it into a CPU's virtual address space. Otherwise,
yes, BUS_DMA_COHERENT for a CPU mapping should probably imply coherence for IO
mappings as well.


> > 
> > And lacking any architectural *requirement* for such ordering, in
> > fact, I'd rather have a big fat rule breakage with a comment than things just
> > working because the order of function calls makes things work.
> 
> Hum, you're right. For me it was obvious that a DMA map should be mapped
> before being loaded, but these are definitely 2 different things. So maybe we
> need to handle BUS_DMA_COHERENT in bus_dmamap_load* too, but the behavior
> needs to be clearly specified before, and documentation updated at the
> same time.


That's what the request I have on the table here is for. It's not all that
complex when you get down to it. I believe that all you need to do is either
modify the man page to state that BUS_DMA_COHERENT may be specified in
bus_dmamap_load*, or state that bus_dmamem_alloc'd memory implies
BUS_DMA_COHERENT for both CPU and IO mappings.

     bus_dmamap_load(tag, dmam, buf, buflen, p, flags)
...			BUS_DMA_NOWAIT    It is not safe to wait (sleep) for
			                  resources during this call.

++                      BUS_DMA_COHERENT  This flag is a request to the
                                          machine dependent code that
                                          any IOMMU mappings be
                                          established in a way that provides
                                          byte coherency for the mapped
                                          memory. If this cannot be achieved
                                          an error will be returned. Usage of
                                          bus_dmamap_sync is still required.

                                          The reason for this flag is that
                                          some systems need to know at
                                          IOMMU load time whether the
                                          mapping is to support
                                          byte coherency or streaming
                                          unidirectional I/O in order to
                                          select which hardware bits to set.

 
OR:


     bus_dmamem_alloc(tag, size, alignment, boundary, segs, ...)
....
            All pages allocated by bus_dmamem_alloc will be assumed
            to be BUS_DMA_COHERENT even if never mapped via bus_dmamem_map.
            This is so the bus_dmamap_load implementation will select
            the correct hardware bits to use when establishing IOMMU mappings.
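
A rough sketch of this second option, again in C and again with purely
hypothetical names. How the load routine would learn that memory came from
bus_dmamem_alloc is not specified above; tagging each segment with an
MD-private flag is just one assumption about how it could be done.

```c
#include <assert.h>
#include <stddef.h>

#define EX_SEG_COHERENT 0x1	/* hypothetical per-segment marker */

/* Hypothetical segment record; not the real bus_dma_segment_t. */
struct ex_dma_segment {
	unsigned long addr;
	unsigned long len;
	int md_flags;		/* hypothetical MD-private flags */
};

/* Stand-in for the allocation step: tag every segment as coherent. */
void
ex_dmamem_alloc_mark(struct ex_dma_segment *segs, size_t nsegs)
{
	assert(segs != NULL || nsegs == 0);
	for (size_t i = 0; i < nsegs; i++)
		segs[i].md_flags |= EX_SEG_COHERENT;
}

/*
 * Stand-in for the check a load routine would make before choosing
 * IOMMU bits: returns 1 only if there is at least one segment and
 * every segment carries the coherent marker.
 */
int
ex_segs_want_coherent(const struct ex_dma_segment *segs, size_t nsegs)
{
	for (size_t i = 0; i < nsegs; i++)
		if ((segs[i].md_flags & EX_SEG_COHERENT) == 0)
			return 0;
	return nsegs > 0;
}
```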


AND/OR:

     bus_dmamem_alloc(tag, size, alignment, boundary, segs, ...)
...			BUS_DMA_NOWAIT    It is not safe to wait (sleep) for
			                  resources during this call.

++                      BUS_DMA_COHERENT  This flag is a request to the
                                          machine dependent code that
                                          memory allocated here will always
                                          be mapped (whether by CPU or
                                          IO MMUs) to support byte coherency
                                          or unidirectional I/O streaming.



OR:

    If you need to ensure byte coherency for both IOMMU and CPU caches,
    you *must* call bus_dmamem_map with the flag BUS_DMA_COHERENT prior
    to calling bus_dmamap_load.


(wrt the last: *barf*)

-matt