Subject: bus_dma_{read,write}*()
To: Jason R Thorpe <thorpej@zembu.com>
From: Eduardo Horvath <eeh@turbolinux.com>
List: tech-kern
Date: 04/20/2000 10:00:15
Note the New Improved (tm) subject line. 8^)

On Wed, 19 Apr 2000, Jason R Thorpe wrote:

> On Tue, Apr 18, 2000 at 09:09:00AM -0700, Eduardo Horvath wrote:

> So, I have a few questions/concerns.
> 
> First of all, I'm concerned that simply byte-lane flipping in the bus
> controller simply isn't going to handle all devices.
> 
> What about the device that works like this:
> 
> struct foo_desc {
> 	u_int16_t control;
> 	u_int16_t status;
> 	u_int32_t addr;
> };
> 
> In native little-endian order, it looks like this:
> 
> 	c0 c1 s0 s1 a0 a1 a2 a3
> 
> With bus-controller byte flipping, the result would be:
> 
> 	s1 s0 c1 c0 a3 a2 a1 a0
> 
> which is WRONG.  The result SHOULD be:
> 
> 	c1 c0 s1 s0 a3 a2 a1 a0

The H/W DTRT as follows:

The IOMMU in the SBus/PCI bus controller has an Invert Endianness bit
which is set when a page is mapped in.  I presume this also flips bytes
according to the access width that the DMA engine requests.  The DMA
engine can be just about anything, including another CPU, and it should
be able to specify the desired access width.

The Ebus DMA chip (DMA engine for HME) is wired to detect the access width
and flip the appropriate number of bytes.  (There's a long section in the
manual that describes this.)

The CPU's MMU has an Invert Endianness bit which will invert the
endianness of all accesses to that page of RAM based on the access width.

The CPU also has a selection of ASIs (Alternate Space Identifiers) that
will cause endianness inversion of individual loads and stores.

(The CPU also has a global Invert Endianness bit, but I don't expect to
use that.)
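
To make the access-width point concrete, here is a small software model
of the two behaviours for the foo_desc layout quoted above.  It is only a
sketch: swap_access(), lane_flip32() and width_flip() are names made up
for illustration, and real hardware does this in the bus logic rather
than by rewriting memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the bytes of a single access of the given width, in place. */
static void
swap_access(uint8_t *p, size_t width)
{
	assert(width == 1 || width == 2 || width == 4 || width == 8);
	for (size_t i = 0; i < width / 2; i++) {
		uint8_t t = p[i];
		p[i] = p[width - 1 - i];
		p[width - 1 - i] = t;
	}
}

/*
 * Fixed byte-lane flip: reverses every 32-bit bus word, no matter how
 * the fields inside that word are laid out.
 */
static void
lane_flip32(uint8_t *buf, size_t len)
{
	for (size_t off = 0; off + 4 <= len; off += 4)
		swap_access(buf + off, 4);
}

/*
 * Width-aware flip: each access is reversed according to its own width,
 * which is what the Invert Endianness bits are described as doing.
 */
static void
width_flip(uint8_t *buf, const size_t *widths, size_t nwidths)
{
	for (size_t i = 0; i < nwidths; i++) {
		swap_access(buf, widths[i]);
		buf += widths[i];
	}
}
```

Label the eight descriptor bytes 0..7 for c0 c1 s0 s1 a0 a1 a2 a3.
lane_flip32() turns that into 3 2 1 0 7 6 5 4 (s1 s0 c1 c0 a3 a2 a1 a0),
the WRONG layout, while width_flip() with widths {2, 2, 4} gives
1 0 3 2 7 6 5 4 (c1 c0 s1 s0 a3 a2 a1 a0), the layout the driver wants.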

> Secondly.. more of a sanity check..
> 
> You would specify byte-order at DMA map creation time, right?  i.e. like:
> 
> 	error = bus_dmamap_create(sc->sc_dmat, sizeof(struct control), 1,
> 	    sizeof(struct control), 0, BUS_DMA_LITTLE, &sc->sc_cdmap);

I thought that would be a good way to do it.

> ...and then you would have to pass the DMA map to bus_dma_read_4() to
> read a value:
> 
> 	val = bus_dma_read_4(sc->sc_dmat, sc->sc_cdmap, &desc->addr);
> 
> ?
> 
> What if you want to have swapped data and octet-stream data in the same
> data structure mapped by the same DMA map?  There are real-world devices
> where this is necessary (Intel i82557 Ethernet).

You would also require a:

	bus_dma_read_multi_1(sc->sc_dmat, sc->sc_cdmap, &desc->addr,
		&val, sizeof(val));

that would not swap endianness.  If you think there might be performance
issues we could also add:

	val = bus_dma_read_stream_4(sc->sc_dmat, sc->sc_cdmap,
		&desc->addr);

that does a straight copy without byte swapping.  But then things get ugly
when you need to deal with things that have a 16-bit bus and need swaps on
2-byte boundaries like ISP controllers.
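
For what it's worth, the intended difference between the three flavours
could be modelled roughly as below.  This is a sketch under assumptions,
not a proposed implementation: the sketch_* names and the enum are
invented here, the byte order is passed directly instead of being looked
up from the DMA tag and map, and a real version would use the MMU bits or
ASIs described above rather than shifting bytes by hand.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the byte order recorded at map creation. */
enum sketch_order { SKETCH_LITTLE, SKETCH_BIG };

/* Swapping read: interpret 4 bytes in the map's declared byte order. */
static uint32_t
sketch_read_4(enum sketch_order order, const uint8_t *p)
{
	if (order == SKETCH_LITTLE)
		return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
		    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
	return (uint32_t)p[3] | (uint32_t)p[2] << 8 |
	    (uint32_t)p[1] << 16 | (uint32_t)p[0] << 24;
}

/* Stream read: straight copy, the bytes keep their memory order. */
static uint32_t
sketch_read_stream_4(const uint8_t *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Byte-wise multi read: copies count bytes and never swaps. */
static void
sketch_read_multi_1(const uint8_t *p, uint8_t *dst, size_t count)
{
	memcpy(dst, p, count);
}
```

With the bytes 78 56 34 12 in memory, sketch_read_4() on a little-endian
map returns 0x12345678 on any host, while the stream and multi_1 variants
hand back the same four bytes in the same order they sit in memory.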

Eduardo Horvath