Subject: Re: Machine-independent bus DMA interface proposal
To: Justin T. Gibbs <gibbs@freefall.freebsd.org>
From: Jason Thorpe <thorpej@nas.nasa.gov>
List: tech-kern
Date: 09/22/1996 22:06:04

On Sun, 22 Sep 1996 21:02:20 -0700
 "Justin T. Gibbs" <gibbs@freefall.freebsd.org> wrote:

 > My main problem with this proposal is that it doubles the space required
 > for sg entries and forces a copy of the bus_dma_segment_t information into
 > the private format of the driver.  This seems an enormous penalty

...and there is a _very_ good reason for this, namely hardware-format
scatter/gather descriptors should _never_ be accessed as structures
in a portable driver.  This just loses completely.  The fact that NetBSD
drivers currently do this type of access is a bug, which will have to
be fixed if those drivers want to be used on an architecture with more
strict structure alignment and packing restrictions.
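
To make that concrete, here is a minimal sketch of filling in a
hardware scatter/gather entry one byte at a time rather than through a
struct overlay.  The 8-byte little-endian layout (32-bit address, then
32-bit length) and the names sg_put_le32() and sg_write_entry() are
made up for illustration; they are not any particular card's format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SG_ENTRY_SIZE	8	/* hypothetical: 4-byte addr + 4-byte len */

/* Store a 32-bit value in little-endian order, one byte at a time. */
void
sg_put_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Write entry `idx' of a hardware S/G list that lives in `list'. */
void
sg_write_entry(uint8_t *list, size_t idx, uint32_t addr, uint32_t len)
{
	uint8_t *e;

	assert(list != NULL);
	e = list + idx * SG_ENTRY_SIZE;
	sg_put_le32(e, addr);		/* bytes 0..3: segment address */
	sg_put_le32(e + 4, len);	/* bytes 4..7: segment length */
}
```

Written this way, the result does not depend on the host's byte order
or on how its compiler pads and aligns structures.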

In terms of performance penalty, in general, I'm not completely
convinced that

	a) it's really going to be significantly more expensive, and

	b) the (probably marginal) performance win is worth
	   the architectural compromise.

At the same time, you want the access to the software scatter/gather
lists to be sane from the programmer's perspective.

Also, the point of this interface is to be device-, bus-, and
machine-independent.  In other words, the fact that 3 PC scsi cards
use the same scatter/gather list format is completely irrelevant in
the scope of designing such a DMA interface.

 > correspond directly to the private driver format.  I would much rather see
 > a family of functions that handle different sg formats which allows the
 > code to be shared among drivers (e.g. the ahb, bt, and aic7xxx, with the
 > exception of length of list have the same format) and kept in one place so
 > that they are easy to port among archs and update if the API changes.
 > This removes the need for translation code in each driver and no extra
 > mapping space is needed.

It's not clear to me that this addresses the concern of bus- and
machine-independent DMA.  I.e. what you're suggesting would
basically require machine-dependent portions for this "scsi card dma
scatter/gather descriptor" function.  "Yuck."  Besides, to address
the DMA mapping problem, you'd _still_ need an interface like this
one, so your suggestion would actually be _more_ expensive.

 > If you don't have a family of functions, the interface has to be enhanced
 > to deal with the restrictions of the different SG formats.  I don't see
 > a per SG size limit in the API and this varies from device to device.

A limit on the DMA segment size is a good suggestion ... (That's why I
posted this now :-)  As such, I've added a "maxsegsz" argument to
bus_dmamap_create() which specifies the maximum number of bytes
that may be transferred by any given DMA segment.
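
As a rough sketch of what honoring "maxsegsz" could mean inside an
implementation: a physically contiguous region is cut into pieces no
larger than the limit.  The helper name split_by_maxsegsz(), the -1
return convention, and the 32-bit stand-in typedefs are assumptions
for illustration, not part of the proposal.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t bus_addr_t;	/* stand-in for the <machine/bus.h> type */
typedef uint32_t bus_size_t;	/* stand-in for the <machine/bus.h> type */

typedef struct {
	bus_addr_t	ds_addr;	/* address of segment in bus space */
	bus_size_t	ds_len;		/* length of this segment */
} bus_dma_segment_t;

/*
 * Split one contiguous region into segments of at most `maxsegsz'
 * bytes.  Returns the number of segments used, or -1 if more than
 * `nsegments' would be needed.
 */
int
split_by_maxsegsz(bus_addr_t addr, bus_size_t len, bus_size_t maxsegsz,
    bus_dma_segment_t *segs, int nsegments)
{
	int n = 0;

	assert(maxsegsz > 0);
	while (len > 0) {
		bus_size_t chunk = len < maxsegsz ? len : maxsegsz;

		if (n == nsegments)
			return (-1);	/* device cannot take this many */
		segs[n].ds_addr = addr;
		segs[n].ds_len = chunk;
		addr += chunk;
		len -= chunk;
		n++;
	}
	return (n);
}
```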

I don't understand how the interface needs to change to deal with
different scatter/gather formats... Under _NO_ circumstances should
a driver make any assumptions about the size and layout of the
bus_dma_segment_t ...

I also don't see how this proposal (which has the effect of ripping
the vtophys() and kvtop() calls out of the drivers) is really more
of a lose than what we currently have.

 > I would also like to see the interface deal with DMA transactions that span
 > multiple contiguous KVAs (aka buffers).  I should be able to read tapes
 > from an SGI that use a 256k block size even if my tape drive is hanging
 > off an old 1542 (16 SG segments).  Heck, I should be able to read tapes
 > written on a NetBSD system with a 256k block size too. I don't see any
 > of this happening unless we can circumvent the MAXBSIZE/MAXPHYS limits
 > by spanning buffers in a single I/O transaction.

Hmm ... interesting idea ... However, this means changing ... a number
of things... For example, how does one have multiple bufs in the first
place?  This could potentially mean changing the interface to
a device's strategy routine (unless I'm missing something totally
obvious)... Seems beyond the scope of this proposal.

Anyhow, new rev follows...

 -- save the ancient forests - http://www.bayarea.net/~thorpej/forest/ -- 
Jason R. Thorpe                                       thorpej@nas.nasa.gov
NASA Ames Research Center                               Home: 408.866.1912
NAS: M/S 258-6                                          Work: 415.604.0935
Moffett Field, CA 94035                                Pager: 415.428.6939

 ----- snip -----

NAS $Id: busdma.doc,v 1.15 1996/09/23 04:57:10 thorpej Exp $

PURPOSE
-------

The purpose of this document is to describe a bus- and machine-independent
DMA mapping interface.


All data structures, function prototypes, and macros will be defined
by the port-specific header <machine/bus.h>.  Note that this document
assumes the existence of types already defined by the current "bus.h"
interface.


Unless otherwise noted, all function calls in this interface may be
defined as CPP macros.



DATA TYPES
----------

Individual implementations may name these structures whatever they
wish, providing that the external representations are:

	bus_dma_tag_t		A machine-dependent opaque type
				describing the implementation of
				DMA for a given bus.

	bus_dma_segment_t	A struct describing a DMA segment.

	bus_dma_handle_t	A pointer to a struct describing
				a complete DMA mapping.

	bus_dmasync_op_t	An enumerated type providing at
				least the following unique values:

		BUS_DMASYNC_PREREAD
		BUS_DMASYNC_POSTREAD
		BUS_DMASYNC_PREWRITE
		BUS_DMASYNC_POSTWRITE

				See bus_dmamap_sync() for more details.



/*
 * bus_dma_segment_t
 *
 *	Describes a single contiguous DMA transfer.
 *
 *	Individual implementations may choose arbitrary layout
 *	for this structure, and individual instances of this
 *	structure may be of arbitrary size.
 */
typedef struct {
	bus_addr_t	ds_addr;	/* address of segment in bus space */
	bus_size_t	ds_len;		/* length of this segment */

	/*
	 * Individual implementations may add "private" members
	 * here.  Drivers are not to assume the existence of
	 * any members other than the two above.
	 */
} bus_dma_segment_t;

/*
 * bus_dma_handle_t
 *
 *	Describes a complete DMA mapping.
 *
 *	dh_segments may be accessed by bus-master drivers to get
 *	the address and length of each transfer.  The number
 *	of valid segments in any particular mapping is kept
 *	in dh_nsegments.  If dh_nsegments is 0, the mapping is not
 *	valid.
 *
 *	Individual implementations may choose arbitrary layout
 *	for this structure, and individual instances of this
 *	structure may be of arbitrary size.
 *
 *	The dh_segments member may or may not be an array.
 */
typedef struct {
	bus_dma_segment_t *dh_segments;	/* dma segment(s) */
	int		dh_nsegments;	/* number of valid segments
					   in mapping */

	/*
	 * Individual implementations may add "private" members
	 * here.  Drivers are not to assume the existence of
	 * any members other than the two above.
	 */
} *bus_dma_handle_t;
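
As a rough illustration of how a driver might consume these two
structures, the sketch below copies the public members of each segment
into a driver-private software list.  The typedefs are user-space
stand-ins for the <machine/bus.h> types; the struct tag
"bus_dma_handle", struct xx_swsg, and xx_fill_sglist() are
hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t bus_addr_t;	/* stand-in for the <machine/bus.h> type */
typedef uint32_t bus_size_t;	/* stand-in for the <machine/bus.h> type */

typedef struct {
	bus_addr_t	ds_addr;
	bus_size_t	ds_len;
} bus_dma_segment_t;

typedef struct bus_dma_handle {		/* tag added for this example */
	bus_dma_segment_t *dh_segments;
	int		dh_nsegments;
} *bus_dma_handle_t;

/* Hypothetical driver-private software copy of one S/G entry. */
struct xx_swsg {
	uint32_t	sg_addr;
	uint32_t	sg_len;
};

/*
 * Copy each valid segment into the driver's own list, touching only
 * ds_addr and ds_len.  Returns the number of entries filled, or -1 if
 * the mapping is not valid (dh_nsegments == 0) or needs more than
 * `max' entries.
 */
int
xx_fill_sglist(bus_dma_handle_t dmah, struct xx_swsg *sg, int max)
{
	int i;

	assert(dmah != NULL && sg != NULL);
	if (dmah->dh_nsegments == 0 || dmah->dh_nsegments > max)
		return (-1);
	for (i = 0; i < dmah->dh_nsegments; i++) {
		sg[i].sg_addr = dmah->dh_segments[i].ds_addr;
		sg[i].sg_len = dmah->dh_segments[i].ds_len;
	}
	return (i);
}
```

The driver never looks at sizeof(bus_dma_segment_t) or at any member
beyond the two public ones, so an implementation stays free to add
private members.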



FUNCTIONS
---------

int	bus_dmamap_create __P((bus_dma_tag_t tag, bus_size_t size,
	    int nsegments, bus_size_t maxsegsz, int flags,
	    bus_dma_handle_t *dmahp));

	bus_dmamap_create() allocates a dma handle and initializes
	it according to the parameters provided.  Arguments are
	as follows:

	tag		This is the bus_dma_tag_t passed down from the
			parent driver via <bus>_attach_args.

	size		This is the maximum DMA transfer that can
			be mapped by the handle.

	nsegments	Number of segments the device can support
			in a single DMA transaction.  This may
			be the number of scatter-gather descriptors
			supported by the device.

	maxsegsz	The maximum number of bytes that may be
			transferred by any given DMA segment.

	flags		Flags are defined as follows:

		BUS_DMA_WAITOK		It is safe to wait (sleep)
					for resources during this call.

		BUS_DMA_NOWAIT		It is not safe to wait (sleep)
					for resources during this call.

		BUS_DMA_ALLOCNOW	Perform any resource allocation
					this handle may need now.
					If this is not specified, the
					allocation may be deferred to
					bus_dmamap_load().  If this flag
					is specified, bus_dmamap_load()
					will not block on resource
					allocation.

		BUS_DMA_BUS[1-4]	These flags are placeholders,
					and may be used by busses to
					provide bus-dependent functionality.

	dmahp		This is a pointer to a bus_dma_handle_t.
			A dma handle will be allocated and
			pointed to by *dmahp upon successful completion
			of this routine.

	Behavior is not defined if invalid arguments are passed to
	bus_dmamap_create().

	RETURN VALUES

	Returns 0 on success or an error code to indicate mode of failure.



void	bus_dmamap_destroy __P((bus_dma_tag_t tag, bus_dma_handle_t dmah));

	bus_dmamap_destroy() frees all resources associated with a
	given dma handle.  Arguments are as follows:

	tag		This is the bus_dma_tag_t passed down from the
			parent driver via <bus>_attach_args.

	dmah		The dma handle to destroy.

	In the event that the dma handle contains a valid mapping,
	the mapping will be unloaded via the same mechanism used
	by bus_dmamap_unload().

	Behavior is not defined if invalid arguments are passed
	to bus_dmamap_destroy().

	RETURN VALUES

	If given valid arguments, bus_dmamap_destroy() always succeeds.
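
	The create/destroy life cycle can be sketched in user space
	as follows.  This is a toy stand-in, not an implementation of
	the interface: the tag and flags arguments are omitted, the
	"th_" private members and the toy_ function names are
	hypothetical, and ENOMEM is only one plausible choice of
	error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned long bus_addr_t;	/* stand-in type */
typedef unsigned long bus_size_t;	/* stand-in type */

typedef struct {
	bus_addr_t	ds_addr;
	bus_size_t	ds_len;
} bus_dma_segment_t;

typedef struct toy_dma_handle {
	bus_dma_segment_t *dh_segments;
	int		dh_nsegments;
	/* "private" members of this toy implementation */
	bus_size_t	th_size;	/* largest transfer mappable */
	bus_size_t	th_maxsegsz;	/* largest single segment */
	int		th_maxsegs;	/* capacity of dh_segments */
} *bus_dma_handle_t;

int
toy_dmamap_create(bus_size_t size, int nsegments, bus_size_t maxsegsz,
    bus_dma_handle_t *dmahp)
{
	bus_dma_handle_t dmah;

	assert(dmahp != NULL && nsegments > 0);
	dmah = malloc(sizeof(*dmah));
	if (dmah == NULL)
		return (ENOMEM);
	dmah->dh_segments = calloc((size_t)nsegments,
	    sizeof(bus_dma_segment_t));
	if (dmah->dh_segments == NULL) {
		free(dmah);
		return (ENOMEM);
	}
	dmah->dh_nsegments = 0;		/* no valid mapping yet */
	dmah->th_size = size;
	dmah->th_maxsegsz = maxsegsz;
	dmah->th_maxsegs = nsegments;
	*dmahp = dmah;
	return (0);
}

void
toy_dmamap_destroy(bus_dma_handle_t dmah)
{
	free(dmah->dh_segments);
	free(dmah);
}
```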



int	bus_dmamap_load __P((bus_dma_tag_t tag, bus_dma_handle_t dmah,
	    caddr_t kva, size_t size, int flags));

	bus_dmamap_load() loads a dma handle with mappings for a
	DMA transfer.  Arguments are as follows:

	tag		This is the bus_dma_tag_t passed down from the
			parent driver via <bus>_attach_args.

	dmah		The dma handle with which to map the
			transfer.

	kva		The kernel virtual address of the buffer
			to be mapped.

	size		The size of the DMA transfer.  This must
			be <= the size given to bus_dmamap_create().

	flags		Flags are defined as follows:

		BUS_DMA_WAITOK		It is safe to wait (sleep)
					for resources during this call.

		BUS_DMA_NOWAIT		It is not safe to wait (sleep)
					for resources during this call.

		BUS_DMA_BUS[1-4]	These flags are placeholders,
					and may be used by busses to
					provide bus-dependent functionality.

	As noted above, if a dma handle is created with
	BUS_DMA_ALLOCNOW, bus_dmamap_load() will never block
	on resource allocation.

	After a successful call to bus_dmamap_load(), the publicly
	accessible members of the dma handle will contain the
	following:

		dh_segments	Will point to one or more
				bus_dma_segment_t's which contain
				the bus address (ds_addr) and length
				(ds_len) values appropriate for
				programming into DMA controller
				registers.

		dh_nsegments	Will contain the number of segments
				pointed to by dh_segments.

	Behavior is not defined if invalid arguments are passed to
	bus_dmamap_load().

	RETURN VALUES

	Returns 0 on success or an error code to indicate mode of failure.
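
	One way an implementation might build the segment list is to
	walk the buffer a page at a time, translate each piece, and
	merge pieces that turn out to be physically adjacent.  The
	sketch below does this in user space under loud assumptions:
	a 4096-byte page, a made-up four-entry page table in place of
	real address translation, a uintptr_t in place of caddr_t,
	no maxsegsz handling, and EINVAL/EFBIG as illustrative error
	codes.  All toy_ and th_ names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE	4096u	/* assumed page size */
#define TOY_MAXSEGS	16	/* capacity of the toy handle */

typedef uintptr_t bus_addr_t;	/* stand-in type */
typedef size_t bus_size_t;	/* stand-in type */

typedef struct {
	bus_addr_t	ds_addr;
	bus_size_t	ds_len;
} bus_dma_segment_t;

struct toy_dma_handle {
	bus_dma_segment_t *dh_segments;
	int		dh_nsegments;
	/* "private" members of this toy implementation */
	bus_size_t	th_size;	/* size given at create time */
	int		th_maxsegs;	/* nsegments given at create time */
	bus_dma_segment_t th_segs[TOY_MAXSEGS];
};

/* Made-up page table: virtual pages 0..3 map to these physical pages. */
static const bus_addr_t toy_pagemap[4] = { 7, 8, 3, 4 };

bus_addr_t
toy_vtophys(uintptr_t va)
{
	return (toy_pagemap[(va / TOY_PAGE_SIZE) % 4] * TOY_PAGE_SIZE +
	    va % TOY_PAGE_SIZE);
}

int
toy_dmamap_load(struct toy_dma_handle *dmah, uintptr_t kva, size_t size)
{
	bus_dma_segment_t *segs = dmah->th_segs;
	int n = 0;

	assert(dmah->th_maxsegs <= TOY_MAXSEGS);
	dmah->dh_nsegments = 0;		/* invalid until the load succeeds */
	if (size > dmah->th_size)
		return (EINVAL);
	while (size > 0) {
		/* Never let one step cross a page boundary. */
		size_t chunk = TOY_PAGE_SIZE - kva % TOY_PAGE_SIZE;
		bus_addr_t pa = toy_vtophys(kva);

		if (chunk > size)
			chunk = size;
		if (n > 0 && segs[n - 1].ds_addr + segs[n - 1].ds_len == pa) {
			segs[n - 1].ds_len += chunk;	/* physically adjacent */
		} else {
			if (n == dmah->th_maxsegs)
				return (EFBIG);	/* too many segments */
			segs[n].ds_addr = pa;
			segs[n].ds_len = chunk;
			n++;
		}
		kva += chunk;
		size -= chunk;
	}
	dmah->dh_segments = segs;
	dmah->dh_nsegments = n;
	return (0);
}
```

	With the page table above, a 0x3000-byte transfer starting at
	virtual address 0x0800 touches four pages but collapses into
	two segments, because physical pages 7/8 and 3/4 are adjacent.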



void	bus_dmamap_unload __P((bus_dma_tag_t tag, bus_dma_handle_t dmah));

	bus_dmamap_unload() deletes the mappings for a given
	dma handle.  Arguments are as follows:

	tag		This is the bus_dma_tag_t passed down from the
			parent driver via <bus>_attach_args.

	dmah		The dma handle containing the mappings
			which are to be deleted.

	If the dma handle was created with BUS_DMA_ALLOCNOW,
	bus_dmamap_unload() will not free the corresponding
	resources which were allocated by bus_dmamap_create().
	This is to ensure that bus_dmamap_load() will never block
	on resources if the handle was created with BUS_DMA_ALLOCNOW.

	Behavior is not defined if invalid arguments are passed to
	bus_dmamap_unload().

	RETURN VALUES

	If given valid arguments, bus_dmamap_unload() always succeeds.



void	bus_dmamap_sync __P((bus_dma_tag_t tag, bus_dma_handle_t dmah,
	    bus_dmasync_op_t op));

	bus_dmamap_sync() performs pre- and post-DMA operation
	cache and/or buffer synchronization.  Arguments are as follows:

	tag		This is the bus_dma_tag_t passed down from the
			parent driver via <bus>_attach_args.

	dmah		The DMA mapping to be synchronized.

	op		The synchronization operation to perform.

	The following DMA synchronization operations are defined:

	BUS_DMASYNC_PREREAD		Perform any pre-read DMA cache
					and/or bounce operations.

	BUS_DMASYNC_POSTREAD		Perform any post-read DMA cache
					and/or bounce operations.

	BUS_DMASYNC_PREWRITE		Perform any pre-write DMA cache
					and/or bounce operations.

	BUS_DMASYNC_POSTWRITE		Perform any post-write DMA cache
					and/or bounce operations.

	This function exists so that multiple read and write transfers
	can be performed with the same buffer, and so that drivers can
	explicitly inform the bus DMA code when their data is 'ready'
	in its DMA buffer.

	An example of multiple read-write use of a single mapping
	might look like:

	bus_dmamap_load(...);

	while (not done) {
		/* invalidate soon-to-be-stale cache blocks */
		bus_dmamap_sync(..., BUS_DMASYNC_PREREAD);

		[ do read DMA ]

		/* copy from bounce */
		bus_dmamap_sync(..., BUS_DMASYNC_POSTREAD);

		/* read data now in driver-provided buffer */

		[ computation ]

		/* data to be written now in driver-provided buffer */

		/* flush write buffers and writeback, copy to bounce */
		bus_dmamap_sync(..., BUS_DMASYNC_PREWRITE);

		[ do write DMA ]

		/* probably a no-op, but provided for consistency */
		bus_dmamap_sync(..., BUS_DMASYNC_POSTWRITE);
	}

	bus_dmamap_unload(...);


	If DMA read and write operations are not preceded and followed
	by the appropriate synchronization operations, behavior is
	undefined.

	Behavior is not defined if invalid arguments are passed to
	bus_dmamap_sync().

	RETURN VALUES

	If given valid arguments, bus_dmamap_sync() always succeeds.
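
	For a mapping that had to be bounced, the four operations
	could reduce to something like the following user-space
	sketch.  Only the op names come from this document; struct
	toy_bounce_map, its members, and toy_dmamap_sync() are
	hypothetical, and a real port would likely also do cache
	maintenance in the cases that are no-ops here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
	BUS_DMASYNC_PREREAD,
	BUS_DMASYNC_POSTREAD,
	BUS_DMASYNC_PREWRITE,
	BUS_DMASYNC_POSTWRITE
} bus_dmasync_op_t;

/* Hypothetical private state for a mapping that uses a bounce buffer. */
struct toy_bounce_map {
	unsigned char	*tb_buf;	/* driver-provided buffer */
	unsigned char	*tb_bounce;	/* buffer the device can reach */
	size_t		tb_len;		/* length of both buffers */
};

void
toy_dmamap_sync(struct toy_bounce_map *map, bus_dmasync_op_t op)
{
	assert(map != NULL);
	switch (op) {
	case BUS_DMASYNC_PREWRITE:
		/* Write DMA is next: stage the data in the bounce buffer. */
		memcpy(map->tb_bounce, map->tb_buf, map->tb_len);
		break;
	case BUS_DMASYNC_POSTREAD:
		/* Read DMA has finished: copy the data out of bounce. */
		memcpy(map->tb_buf, map->tb_bounce, map->tb_len);
		break;
	case BUS_DMASYNC_PREREAD:
	case BUS_DMASYNC_POSTWRITE:
		/* Nothing to copy in this toy version. */
		break;
	}
}
```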



caddr_t	bus_dmamem_alloc __P((bus_dma_tag_t tag, size_t size,
	    int nsegments, int flags));

	bus_dmamem_alloc() allocates memory that is "DMA safe"
	for the bus corresponding to the given tag and maps it
	into kernel virtual address space.  Arguments are
	as follows:

	tag		This is the bus_dma_tag_t passed down from the
			parent driver via <bus>_attach_args.

	size		The amount of memory to allocate.

	nsegments	Specifies the maximum number of segments
			that may compose the allocated memory.  For
			example, if this value is 1, the entire
			allocated memory region must be physically
			contiguous.  If this value is 2, the allocated
			memory region may consist of up to 2
			physically contiguous segments, etc.

	flags		Flags are defined as follows:

		BUS_DMA_WAITOK		It is safe to wait (sleep)
					for resources during this call.

		BUS_DMA_NOWAIT		It is not safe to wait (sleep)
					for resources during this call.


	Behavior is undefined if invalid arguments are passed to
	bus_dmamem_alloc().

	RETURN VALUES

	Returns the kernel virtual address of the allocated memory
	or NULL if the request cannot be satisfied.



void	bus_dmamem_free __P((bus_dma_tag_t tag, caddr_t kva,
	    size_t size));

	bus_dmamem_free() unmaps and frees memory previously allocated
	by bus_dmamem_alloc().  Arguments are as follows:

	tag		This is the bus_dma_tag_t passed down from the
			parent driver via <bus>_attach_args.

	kva		The kernel virtual address of the memory
			region to be freed.

	size		The amount of memory that was allocated
			by bus_dmamem_alloc().

	Behavior is undefined if invalid arguments are passed to
	bus_dmamem_free().

	RETURN VALUES

	If given valid arguments, bus_dmamem_free() always succeeds.
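
	The intended pairing of the two calls might look like the
	sketch below, which uses calloc()/free() as user-space
	stand-ins.  The toy_ functions, the TOY_DMA_* flag values,
	xx_with_ccb(), and the choice to return NULL for a zero size
	are all assumptions for illustration; the tag argument is
	omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_DMA_WAITOK	0x0	/* hypothetical values; this document */
#define TOY_DMA_NOWAIT	0x1	/* does not assign numbers to flags */

void *
toy_dmamem_alloc(size_t size, int nsegments, int flags)
{
	(void)flags;		/* this stand-in never sleeps */
	if (size == 0 || nsegments < 1)
		return (NULL);
	return (calloc(1, size));
}

void
toy_dmamem_free(void *kva, size_t size)
{
	(void)size;		/* a real implementation needs it to unmap */
	free(kva);
}

/* Allocate a control block, use it, and release it again. */
int
xx_with_ccb(size_t ccbsize)
{
	void *ccb = toy_dmamem_alloc(ccbsize, 1, TOY_DMA_NOWAIT);

	if (ccb == NULL)
		return (-1);
	/* ... the device would DMA to/from the control block here ... */
	toy_dmamem_free(ccb, ccbsize);	/* same size as at allocation */
	return (0);
}
```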