tech-kern archive


Re: Am I using bus_dma right?



> Let me try to simplify these concepts.

Thank you; that would help significantly.

>> I'm not doing read/write DMA.  [...]
> If you are not doing DMA you don't need to do any memory
> synchronization (modulo SMP issues with other CPUs, but that's a
> completely different topic.)

Oh, I'm doing DMA.  Just not read/write DMA.  (Buffer descriptors are
write-direction DMA only, data is read-direction DMA only.)

> The problem is many modern CPUs have write-back caches which are not
> shared by I/O devices.  So when you do a read operation (from device
> to CPU) you should:

> 1) Do a BUS_DMASYNC_PREREAD to make sure there is no data in the
> cache that may be written to DRAM during the I/O operation.

> 2) Tell the hardware to do the read operation.

> 3) When the transaction completes issue a BUS_DMASYNC_POSTREAD to
> make sure the CPU sees the data in DRAM not stale data in the cache.

Okay, here's the first problem.  There is no clear "transaction
completes".

The card has a DMA engine on it (a PLX9080, on the off chance you've
run into it before) that can DMA into chained buffers.  I set it up
with a ring of buffers - a chain of buffers with the last buffer
pointing to the first, none of them with the "end of chain" bit set -
and tell it to go.  I request an interrupt at completion of each
buffer, so I have a buffer-granularity idea of where it's at, modulo
interrupt servicing latency.

This means that there is no clear "this transfer has completed" moment.
What I want to do is inspect the DMA buffer to see how far it's been
overwritten, since there is a data value I know cannot be generated by
the hardware that's feeding samples to the card (over half the data
pins are hardwired to known logic levels).

I've been treating it as though my inspection of a given sample in the
buffer counts as "transfer completed" for purposes of that sample.
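
For concreteness, the kind of inspection described above might be
sketched as follows.  This is a minimal userland illustration, not the
code from 7300a.c: the 32-bit sample width, the marker value, and the
names SENTINEL and scan_new_samples are all assumptions.  In a driver,
a bus_dmamap_sync() with BUS_DMASYNC_POSTREAD over the range would come
before the reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value the sampling hardware can never produce. */
#define SENTINEL UINT32_C(0xffffffff)

/*
 * Walk a ring of nsamples slots beginning at index start and collect
 * the samples the device has overwritten so far.  Each consumed slot
 * is copied to out[] and reset to SENTINEL so that the next lap of the
 * DMA engine can be detected.  Stops at the first slot that still
 * holds SENTINEL, after max samples, or after one full lap.
 * Returns the number of samples copied out.
 */
static size_t
scan_new_samples(volatile uint32_t *ring, size_t nsamples, size_t start,
    uint32_t *out, size_t max)
{
	size_t n = 0;
	size_t i = start;

	assert(nsamples > 0 && start < nsamples);
	while (n < max && n < nsamples) {
		uint32_t v = ring[i];

		if (v == SENTINEL)
			break;		/* device has not reached this slot */
		out[n++] = v;
		ring[i] = SENTINEL;	/* the "host writes the buffer" half */
		i = (i + 1) % nsamples;	/* wrap at the end of the ring */
	}
	return n;
}
```

The scan treats "slot no longer holds the marker" as the only evidence
of progress, which is why it does not depend on the per-buffer
completion interrupt.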

> When you do a write operation you should:

> 1) Make sure the buffer contains all the data you want to transmit.

> 2) Do a BUS_DMASYNC_PREWRITE to make sure any data that may remain in
> the CPU writeback cache is flushed to memory.

> 3) Tell the hardware to do the write operation.

> 4) When the write operation completes... well it shouldn't matter.

...but, according to the 8.0 manpage, I should do a POSTWRITE anyway,
and going under the hood (this is all on amd64), I find that PREREAD is
a no-op and POSTWRITE might matter because it issues an mfence to avoid
memory access reordering issues.

> If you have a ring buffer you should try to map it CONSISTENT which
> will disable all caching of that memory.

CONSISTENT?  I don't find that anywhere; do you mean COHERENT?

> However, some CPUs will not allow you to disable caching, so you
> should put in the appropriate bus_dmamap_sync() operations so the
> code will not break on those machines.

For my immediate needs, I don't care about anything other than amd64.
But I'd prefer to understand the paradigm properly for the benefit of
potential future work.

> When you set up the mapping for the ring buffer you should do either
> a BUS_DMASYNC_PREREAD, or if you need to initialize some structures
> in that buffer use BUS_DMASYNC_PREWRITE.  One will do a cache
> invalidate, the other one will force a writeback operation.

I already PREWRITE the whole DMA-accessible area before telling the DMA
engine to start.

> When you get a device interrupt, you should do a BUS_DMASYNC_POSTREAD
> to make sure anything that might have magically migrated into the
> cache has been invalidated.

There is no interrupt involved, in general.  I request interrupts at
buffer boundaries, but the buffers are very big compared to most DMAed
blocks - a typical DMAed block is about 800 bytes, but the buffers are
half a meg.

> Then copy the data out of the ring buffer and do another
> BUS_DMASYNC_PREREAD or BUS_DMASYNC_PREWRITE as appropriate.

Then I think I was already doing everything necessary.  And, indeed, I
tried making the read routine do POSTREAD|POSTWRITE before and
PREREAD|PREWRITE after its read-test-write of the samples, and it
didn't help.
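
The bracketing just described could be sketched like this.  It is a
userland mock-up under stated assumptions, not the driver's actual
code: the typedefs, the flag values, and the bus_dmamap_sync() body are
stand-ins so that it compiles outside the kernel (in a NetBSD driver
they come from <sys/bus.h>), and SENTINEL and take_sample are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userland stand-ins (assumptions); the kernel supplies the real ones. */
typedef void *bus_dma_tag_t;
typedef void *bus_dmamap_t;
typedef unsigned long bus_addr_t;
typedef unsigned long bus_size_t;
#define BUS_DMASYNC_PREREAD   0x01
#define BUS_DMASYNC_POSTREAD  0x02
#define BUS_DMASYNC_PREWRITE  0x04
#define BUS_DMASYNC_POSTWRITE 0x08

/* Record of the ops passed to the stub, for demonstration only. */
static int sync_log[8];
static size_t sync_count;

/* Stub: logs ops.  The real function does cache and barrier work. */
static void
bus_dmamap_sync(bus_dma_tag_t t, bus_dmamap_t m, bus_addr_t off,
    bus_size_t len, int ops)
{
	(void)t; (void)m; (void)off; (void)len;
	if (sync_count < sizeof(sync_log) / sizeof(sync_log[0]))
		sync_log[sync_count++] = ops;
}

#define SENTINEL UINT32_C(0xffffffff)	/* hypothetical marker value */

/*
 * Read-test-write of the sample at index idx, with POST* syncs before
 * the CPU touches the slot and PRE* syncs after.  Returns 1 and stores
 * the sample in *out if the device had written the slot, or 0 if the
 * slot still held SENTINEL.
 */
static int
take_sample(bus_dma_tag_t t, bus_dmamap_t m, volatile uint32_t *ring,
    size_t idx, uint32_t *out)
{
	bus_addr_t off = idx * sizeof(uint32_t);
	uint32_t v;
	int got = 0;

	assert(ring != NULL && out != NULL);
	bus_dmamap_sync(t, m, off, sizeof(uint32_t),
	    BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
	v = ring[idx];
	if (v != SENTINEL) {
		*out = v;
		ring[idx] = SENTINEL;	/* hand the slot back to the device */
		got = 1;
	}
	bus_dmamap_sync(t, m, off, sizeof(uint32_t),
	    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
	return got;
}
```

The sync range here covers only the one sample being examined; a
driver might equally sync a larger span once per batch.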

>> One of the things that confuses me is that I have no write-direction
>> DMA going on at all; all the DMA is in the read direction.  But
>> there is a driver write to the buffer that is, to put it loosely,
>> half of a write DMA operation (the "host writes the buffer" half).
> When the CPU updates the contents of the ring buffer it *is* a DMA
> write,

Well, maybe from bus_dma's point of view, but I would not say there is
write-direction DMA happening unless something DMAs data out of memory.

> even if the device never tries to read the contents, since the update
> must be flushed from the cache to DRAM or you may end up reading
> stale data later.

So I have to treat it like a DMA write even if there is never any
write-direction DMA actually going on?

Then the problem *probably* is not bus_dma botchery.

Someone else wrote me saying it was difficult to tell much without
actually seeing the code.  I've got that up in my anonymous FTP area:
ftp.rodents-montreal.org:/mouse/misc/7300a.c, .../7300a.h, and
.../7300a-reg.h.  (They are also accessible via HTTP, though I'm not
sure what Content-Type .c and .h files get served as; try
http://ftp.rodents-montreal.org/mouse/misc/7300a.c etc. for the HTTP
view of them.)

I'm in the process of rewriting the driver in a more traditional
paradigm; I _think_ I can compensate for the API differences in the
userland code.  I'm wondering if there may be something bus_dma isn't
quite handling correctly here because I'm using it in an unusual way,
so I'm trying to see if I can get closer to its designed-for use case.

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse%rodents-montreal.org@localhost
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

