Subject: Re: Proposal for modification of bus_dma(9)
To: None <tech-kern@NetBSD.ORG, thorpej@nas.nasa.gov>
From: Ross Harvey <ross@teraflop.com>
List: tech-kern
Date: 02/01/1998 05:06:45

I do want to state for the record that I think bus_dma(9) is a work
of exceptional merit, and that I will continue to hold that opinion
even if BUS_DMAMEM_NOSYNC is allowed to live.

Also for the record, Avalon has not at all pressured Jason into
twisting the interface for our benefit.  The only actual request we
made was that the feature not be _used_ in a single driver project
Matthew Jacob was doing at NAS.

I point out that the NOSYNC mapping is the one and only case where the
actual hardware buffer (possibly not quite the same as what the driver
_thinks_ is the buffer) must be addressable by the kernel driver.  Except
for NOSYNC, bus_dma would, in an absurd example that demonstrates the
abstraction, allow the control of a DMA peripheral not even on the same
computer!

This is the first I've heard of "partial sync", and I like it. It adds
efficiency and it "lightens" the interface. It applies directly, for
example, to an issue raised by Matt Thomas in the most recent driver
busification.  Seems like a win.
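
To put a rough number on the efficiency point, here is a small user-space
sketch, not kernel code.  CACHE_LINE and lines_to_sync() are hypothetical
names made up for this example, and the 32-byte line size is an assumption;
real values are machine-dependent.  It only counts how many cache lines a
sync of a given (offset, length) range would have to touch.

```c
#include <stddef.h>

/* Hypothetical cache line size; real values are machine-dependent. */
#define CACHE_LINE 32u

/*
 * Number of cache lines that a sync of [offset, offset + len) would
 * have to touch.  A stand-in for the work a real sync routine might do.
 */
static size_t
lines_to_sync(size_t offset, size_t len)
{
	size_t first, last;

	if (len == 0)
		return 0;
	first = offset / CACHE_LINE;
	last = (offset + len - 1) / CACHE_LINE;
	return last - first + 1;
}
```

Under those assumptions, syncing a whole 1024-byte descriptor ring touches
32 lines, while syncing one 16-byte descriptor at offset 80 touches one.
A 16-byte descriptor at offset 24 straddles a line boundary and touches two.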

As an example of how NOSYNC might cause subtle general troubles:  on
processors that can reorder stores, how does an MI driver with a NOSYNC
mapping arrange for, say, a transmit descriptor address to be flushed
out prior to, say, the word containing the descriptor valid bit?!  I
bet we get away with murder sometimes by implicitly relying on the
atomicity of cache line fills and writebacks. And then one day the
descriptor is partly in one cache line and partly in another, and we
lose. (On some Alphas, a word can sit in the write buffer for 256
cycles before being written.  It will be written prior to DMA by the
coherency logic, but not necessarily before the device has already
seen the "second" write.)
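
The ordering problem can be sketched in portable C11.  This is only an
illustration: struct tx_desc, TXD_VALID and txd_publish() are hypothetical,
not taken from any real driver, and the C11 fence stands in for whatever
machine-dependent write barrier (on Alpha, the wmb instruction) or sync
call a kernel would really use.  Whether a C11 fence is enough to order
stores as seen by a DMA device is itself machine-dependent, which is
exactly why an MI driver has no good way to express this on its own.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transmit descriptor; layout invented for illustration. */
struct tx_desc {
	uint32_t addr;		/* bus address of the packet buffer */
	uint32_t flags;		/* control word holding the valid bit */
};

#define TXD_VALID 0x80000000u	/* hypothetical "descriptor valid" bit */

/*
 * Fill in the buffer address, then mark the descriptor valid.  The
 * fence asks that the address store be ordered before the valid-bit
 * store; without it, a CPU that reorders stores may expose the valid
 * bit first.
 */
static void
txd_publish(volatile struct tx_desc *d, uint32_t buf_addr)
{
	assert(d != NULL);
	d->addr = buf_addr;
	atomic_thread_fence(memory_order_release);
	d->flags |= TXD_VALID;
}
```

With a NOSYNC mapping there is no sync call at which the MD code could
insert such a barrier on the driver's behalf.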

I suspect that in general, on systems that could have done NOSYNC,
sync() will be a harmless no-op or near-no-op and is therefore not
worth avoiding.  I also suspect that there are or will be architectures
other than the A12 that cannot do NOSYNC mapping.

And I guess I will take Jason up on his correction offer: the A12 _can_
do direct, cache-coherent DMA from its main "bus", the crossbar switch
port, to any point in memory. It's just that PCI is a secondary bus on
the A12, and it has exactly the behavior and structure that Jason described.

Ross Harvey
ross@teraflop.com