Subject: Re: Proposal for modification of bus_dma(9)
To: Ross Harvey <ross@teraflop.com>
From: Jason Thorpe <thorpej@nas.nasa.gov>
List: tech-kern
Date: 02/02/1998 19:47:24
[ With any luck, this message will actually make it out to the world. The
network problems at MAE West are becoming a ... large source of annoyance
for me right now. ]
On Sun, 1 Feb 1998 05:06:45 -0800 (PST)
Ross Harvey <ross@teraflop.com> wrote:
> I do want to state for the record that I think bus_dma(9) is a work
> of exceptional merit, and that I will continue to hold that opinion
> even if BUS_DMAMEM_NOSYNC is allowed to live.
Heh, thanks :-) Given some comments from Jonathan Stone, I think I
may indeed allow it to live, but in a slightly different way; see below.
> Also for the record, Avalon has not at all pressured Jason into
> twisting the interface for our benefit. The only actual request we
> made was that the feature not be _used_ in a single driver project
> Matthew Jacob was doing at NAS.
Yah, I didn't mean to imply anything other than the A12 hardware has
caused me to think about it some more :-)
> I point out that the NOSYNC mapping is the one and only case where the
> actual hardware buffer (possibly not quite the same as what the driver
> _thinks_ is the buffer) must be addressable by the kernel driver. Except
> for NOSYNC, bus_dma would, in an absurd example that demonstrates the
> abstraction, allow the control of a dma peripheral not even on the same
> computer!
That's a very good point. And I think that allowing this has value
in some interesting applications.
> This is the first I've heard of "partial sync", and I like it. It adds
> efficiency and it "lightens" the interface. It applies directly, for
> example, to an issue raised by Matt Thomas in the most recent driver
> busification. Seems like a win.
[ Example: re-ordered stores. ]
So, here is my somewhat fleshed out idea on how to change the bus_dma
interface to deal with these problems:
(1) Change the BUS_DMAMEM_NOSYNC flag to BUS_DMA_COHERENT. This
    is a *hint*, and nothing more. It will be passed only to
    bus_dmamem_map(). The semantics are:

        bus_dmamem_map: If possible on a given platform,
        map the memory in such a way that it will be DMA
        coherent. This may include mapping the pages into
        uncached address space or setting the cache-inhibit
        bits in page table entries. If a given platform cannot
        map the host RAM in a way that is DMA coherent, this flag
        is ignored.

        bus_dmamap_load*: When a dmamap is loaded, the
        machine-dependent code will take whatever action
        is necessary to determine if the memory is mapped
        in a DMA coherent way. This may include checking
        if the KVA lies in uncached address space or if
        the page table entries have the cache-inhibited bits
        set. If so, state is kept in the dmamap to indicate
        this to later invocations of bus_dmamap_sync().
(2) Add the following public member to bus_dmamap_t, per
    my previous message:

        int dm_mapsize;     The size of the current DMA mapping.
                            A size of 0 indicates the mapping is
                            invalid.

    Note that dm_mapsize == 0 replaces dm_nsegs == 0 as the
    standard way of determining if a dmamap contains a
    valid mapping.
(3) Change the bus_dmamap_sync() interface per my previous
    message:

        void bus_dmamap_sync __P((bus_dma_tag_t tag,
            bus_dmamap_t dmamap, bus_addr_t offset,
            bus_size_t len, int ops));

            offset  offset into the mapping to synchronize
            len     length of mapping from offset to synchronize
            ops     one or more DMA synchronization operations

    Valid synchronization operations:

        BUS_DMASYNC_PREREAD
        BUS_DMASYNC_PREWRITE
        BUS_DMASYNC_POSTREAD
        BUS_DMASYNC_POSTWRITE

    Synchronization operations are expressed from the perspective
    of the host RAM, e.g. a device -> memory operation is a READ,
    and a memory -> device operation is a WRITE.

    bus_dmamap_sync() may consult state within the dmamap to
    determine if the memory is mapped in a DMA coherent way.
    If so, bus_dmamap_sync() may elect to skip certain expensive
    operations, such as flushing the data cache (esp. on systems
    which cannot flush specific ranges of the cache).

    On platforms which implement re-ordered stores, bus_dmamap_sync()
    will always cause the store buffer to be flushed.
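As a rough illustration of points (1) through (3), here is a small
user-space model of the proposed behavior. It is only a sketch: every
sk_/SK_ name, the flag values, and the counters that stand in for real
cache and store-buffer work are assumptions made up for this example,
the bus_dma_tag_t argument is left out, and the platform is assumed to
be one that re-orders stores. It is not the machine-dependent code of
any port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All sk_/SK_ names are hypothetical stand-ins, not the bus_dma(9) API. */
#define SK_DMA_COHERENT   0x01  /* hint, passed to the map step only */

#define SK_SYNC_PREREAD   0x01
#define SK_SYNC_PREWRITE  0x02
#define SK_SYNC_POSTREAD  0x04
#define SK_SYNC_POSTWRITE 0x08
#define SK_SYNC_ALL       0x0f

struct sk_kva {                 /* models a kernel mapping of DMA memory */
    bool cache_inhibited;       /* uncached segment, or CI bits in the PTEs */
};

struct sk_dmamap {
    size_t   dm_mapsize;        /* 0 means "no valid mapping" */
    bool     coherent;          /* private state recorded at load time */
    unsigned cache_flushes;     /* counters standing in for hardware work */
    unsigned wbuf_drains;
};

/* (1) Map: honor the hint only where the platform is able to. */
void
sk_dmamem_map(struct sk_kva *kva, bool can_uncache, int flags)
{
    kva->cache_inhibited = can_uncache && (flags & SK_DMA_COHERENT) != 0;
}

/* (1) Load: inspect the mapping itself, not the flag, and remember it. */
void
sk_dmamap_load(struct sk_dmamap *map, const struct sk_kva *kva, size_t len)
{
    assert(len > 0);
    map->dm_mapsize = len;
    map->coherent = kva->cache_inhibited;
}

/* (2) A dmamap holds a valid mapping iff dm_mapsize is non-zero. */
bool
sk_dmamap_valid(const struct sk_dmamap *map)
{
    return map->dm_mapsize != 0;
}

/* (3) Sync only [offset, offset + len) of the mapping. */
void
sk_dmamap_sync(struct sk_dmamap *map, size_t offset, size_t len, int ops)
{
    assert(sk_dmamap_valid(map));
    assert(offset <= map->dm_mapsize && len <= map->dm_mapsize - offset);
    assert(ops != 0 && (ops & ~SK_SYNC_ALL) == 0);

    map->wbuf_drains++;         /* store buffer is always drained */
    if (!map->coherent)
        map->cache_flushes++;   /* skipped when memory is DMA coherent */
}
```

In this model the driver never looks at the coherent bit itself; it
calls the sync routine unconditionally, and the decision to skip the
cache work stays inside the machine-dependent side.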
So, in the case of Matt's Tulip transmit descriptor (assuming all of
the transmit descriptors are mapped by a single DMA mapping, for
simplicity):
    /*
     * Fill in the descriptor with the mapping just created for
     * this mbuf chain.
     *
     * No need to POSTREAD|POSTWRITE here, since that was done
     * when the last "transmit complete" interrupt occurred for
     * this descriptor.
     */
    txdesc[idx].addr = dmamap->dm_segs[0].ds_addr;
    txdesc[idx].len = dmamap->dm_segs[0].ds_len;

    /* Push the address and length out before the valid bit is set. */
    bus_dmamap_sync(dmat, dmamap, TXDESCOFF(idx), TXDESCSIZE,
        BUS_DMASYNC_PREWRITE);

    /* Only now mark the descriptor valid, and sync it once more. */
    txdesc[idx].flags |= TXDESC_VALID;
    bus_dmamap_sync(dmat, dmamap, TXDESCOFF(idx), TXDESCSIZE,
        BUS_DMASYNC_PREREAD|BUS_DMASYNC_PREWRITE);
Note that this will not cause any real additional overhead on systems
which don't need all of this song and dance, because the synchronization
would optimize out to a noop.
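The fragment above uses TXDESCOFF() and TXDESCSIZE without defining
them. One plausible reading, assuming the descriptors sit back to back
in a single ring covered by one DMA mapping, is sketched here. The
descriptor layout, the ring size, and the SK_ names are hypothetical
and chosen only for illustration; a real Tulip descriptor is laid out
by the chip, not by this struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor layout, for illustration only. */
struct sk_txdesc {
    uint32_t flags;
    uint32_t len;
    uint32_t addr;
    uint32_t next;
};

#define SK_NTXDESC        32    /* assumed number of descriptors in the ring */

/* Size of one descriptor: the "len" argument of a partial sync. */
#define SK_TXDESCSIZE     sizeof(struct sk_txdesc)

/* Byte offset of descriptor idx within the mapping: the "offset" argument. */
#define SK_TXDESCOFF(idx) ((size_t)(idx) * SK_TXDESCSIZE)

/* Size of the whole ring, i.e. what dm_mapsize would be once loaded. */
#define SK_TXRINGSIZE     (SK_NTXDESC * SK_TXDESCSIZE)
```

With definitions along these lines, each sync touches exactly one
descriptor's worth of the mapping instead of the whole ring.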
I can get cracking on this this week if we reach consensus here.
Jason R. Thorpe thorpej@nas.nasa.gov
NASA Ames Research Center Home: +1 408 866 1912
NAS: M/S 258-5 Work: +1 650 604 0935
Moffett Field, CA 94035 Pager: +1 415 428 6939