Subject: BUS_DMA_CONTROLDATA flag for bus_dma(9)
To: None <tech-kern@netbsd.org>
From: Jason Thorpe <thorpej@wasabisystems.com>
List: tech-kern
Date: 08/13/2003 18:16:53
Hi folks...

Currently, bus_dma(9) has no real special support for "control data", 
i.e. buffer descriptors used for network interfaces, s/g lists for disk 
controllers, etc.  However, because of the nature of control data, it 
sometimes requires special handling.

Right now, this is sort of "kludged" using the BUS_DMA_COHERENT flag.  
That flag is used as a "hint" to the back-end that the memory should be 
mapped in a "coherent" fashion.  However, the bus_dma(9) documentation 
notes that this flag should not be relied upon for correct operation.

In the course of addressing performance issues on some platforms, some 
people (including myself) have changed some network drivers to not rely 
on BUS_DMA_COHERENT for control data.  On platforms like XScale, this 
means that the control data will be mapped cacheable, and thus 
write-combined when it makes its way out to SDRAM.  This 
write-combining effect is actually what we're looking for.

Unfortunately, the nature of control data means that this does not 
always work.  Consider a device whose buffer descriptors don't quite 
map to cache lines on the host.  A problem with the wm(4) driver on a 
PowerPC platform was recently attributed to this issue; re-adding 
BUS_DMA_COHERENT fixed the problem on this particular platform.
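
To make the cache line mismatch concrete, here is a rough sketch (not 
taken from any driver) that works out which cache lines a descriptor 
touches.  The helper names, the descriptor sizes, and the assumption 
that the ring starts on a cache line boundary are all mine, purely for 
illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative helpers only.  Assumes the descriptor ring starts at a
 * cache-line-aligned address and descriptors are packed back to back.
 */

/* Index of the cache line holding the first byte of descriptor idx. */
static size_t
desc_first_line(size_t idx, size_t desc_size, size_t line_size)
{
	assert(desc_size > 0 && line_size > 0);
	return (idx * desc_size) / line_size;
}

/* Index of the cache line holding the last byte of descriptor idx. */
static size_t
desc_last_line(size_t idx, size_t desc_size, size_t line_size)
{
	assert(desc_size > 0 && line_size > 0);
	return (idx * desc_size + desc_size - 1) / line_size;
}

/* Nonzero if descriptors a and b touch at least one common cache line. */
static int
descs_share_line(size_t a, size_t b, size_t desc_size, size_t line_size)
{
	return desc_first_line(a, desc_size, line_size) <=
	        desc_last_line(b, desc_size, line_size) &&
	    desc_first_line(b, desc_size, line_size) <=
	        desc_last_line(a, desc_size, line_size);
}
```

With, say, 16-byte descriptors and 32-byte lines, descriptors 0 and 1 
share a line, so a cache operation on behalf of one of them also 
covers the other, which the device may own at that moment.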

There is another problem with mapping this control data cached -- it 
pollutes the data cache.  The cached copy of this control data is 
almost immediately thrown away, because the control data needs to be 
re-read from memory later to learn the status of the transfer.

What I'd like to do is add another flag to bus_dma(9) to specifically 
address the special needs of control data.  This flag would tell the 
back-end that it MUST take the special action required to correctly 
manipulate control data.  On this PowerPC platform where we had the 
wm(4) problem, it would map the data non-cached.  On an XScale 
platform, it would map the data non-cached but write-bufferable (so we 
still get the write-combining benefit).  bus_dmamap_sync() would still 
be required for proper operation (as it would need to e.g. drain the 
write buffer on XScale).
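
The per-platform behavior described above could be modeled roughly as 
follows.  This is a stand-alone sketch, not back-end code: the 
sketch_* names and enums are hypothetical, the numeric value given to 
BUS_DMA_CONTROLDATA is a placeholder, and treating "cached" as the 
default mapping when the flag is absent is my simplification.

```c
#include <assert.h>

/* Placeholder bit for the proposed flag; not a real assigned value. */
#define BUS_DMA_CONTROLDATA	0x1000

/* Hypothetical names, used only for this illustration. */
enum sketch_platform { SKETCH_POWERPC, SKETCH_XSCALE };

enum sketch_mapping {
	SKETCH_MAP_CACHED,		/* assumed default for ordinary data */
	SKETCH_MAP_UNCACHED,		/* non-cached */
	SKETCH_MAP_UNCACHED_BUFFERED	/* non-cached, write-bufferable */
};

/* Pick a mapping type the way the proposal describes it. */
static enum sketch_mapping
sketch_choose_mapping(enum sketch_platform plat, int flags)
{
	assert(plat == SKETCH_POWERPC || plat == SKETCH_XSCALE);

	if ((flags & BUS_DMA_CONTROLDATA) == 0)
		return SKETCH_MAP_CACHED;

	switch (plat) {
	case SKETCH_XSCALE:
		/* Keep the write-combining benefit. */
		return SKETCH_MAP_UNCACHED_BUFFERED;
	case SKETCH_POWERPC:
	default:
		return SKETCH_MAP_UNCACHED;
	}
}

/*
 * A sync is still needed for a bufferable mapping, because the write
 * buffer has to be drained before the device looks at the data.
 */
static int
sketch_sync_must_drain(enum sketch_mapping m)
{
	return m == SKETCH_MAP_UNCACHED_BUFFERED;
}
```

The point of the second helper is only that a non-cached mapping does 
not make bus_dmamap_sync() optional.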

So, my proposed change is:

	* Add a BUS_DMA_CONTROLDATA flag to bus_dmamem_map().  This
	  tells bus_dmamem_map() that it needs to map the memory in
	  a way appropriate for control data.

	* Add a BUS_DMA_CONTROLDATA flag to bus_dmamap_load*().  This
	  tells bus_dmamap_load*() that the region being loaded into
	  the DMA map is control data.  This allows it to mark the
	  DMA map appropriately so that it can adjust the operation
	  of bus_dmamap_sync().

Use of BUS_DMA_CONTROLDATA in both sets of calls would be required for 
correct operation; using the flag incorrectly would result in 
undefined behavior.
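
As a rough illustration of the "both sets of calls" rule, a debugging 
check might look something like the sketch below.  Nothing like this 
exists today; the function name is hypothetical, the flag value is a 
placeholder, and I am assuming that passing the flag to neither call 
(ordinary data) counts as consistent.

```c
#include <assert.h>

/* Placeholder bit for the proposed flag; not a real assigned value. */
#define BUS_DMA_CONTROLDATA	0x1000

/*
 * Hypothetical check: returns nonzero when the flags given to the
 * mapping call and to the load call agree about whether the memory
 * is control data.  A mismatch is the "undefined behavior" case.
 */
static int
sketch_flags_consistent(int mem_map_flags, int load_flags)
{
	int mapped = (mem_map_flags & BUS_DMA_CONTROLDATA) != 0;
	int loaded = (load_flags & BUS_DMA_CONTROLDATA) != 0;

	return mapped == loaded;
}
```

A driver would pass BUS_DMA_CONTROLDATA to both bus_dmamem_map() and 
bus_dmamap_load*() for its descriptor memory, and to neither for 
ordinary data buffers.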

Comments?

         -- Jason R. Thorpe <thorpej@wasabisystems.com>