Subject: suggestion re bus_dma(9) "dma memory allocation"
To: None <tech-kern@netbsd.org>
From: Chris Torek <torek@bsdi.com>
List: tech-kern
Date: 08/17/2003 14:14:39
I am not sure anyone will really care about this, but I have
a suggestion that I think could help clean up / shorten the
code in drivers, just a bit.

Currently, if I understand the bus_dma(9) man pages, to allocate
kernel-virtual memory that will be shared with some bus-attached
device (such as ring descriptors for an Ethernet or control blocks
for a SCSI adapter), one must do the following:

    int error;
    bus_size_t shared_size;
    bus_dmamap_t dm;
    bus_dma_segment_t segs_array[NSEGS];
    int actual_segments;

    shared_size = number_of_descriptors * size_of_descriptors;

    /* all of this is assuming no error */
    error = bus_dmamap_create(tag, shared_size, NSEGS,
	MAX_SEG_SIZE, BOUNDARY, flags, &dm);
	    /* MAX_SEG_SIZE and BOUNDARY are device constraints */

    error = bus_dmamem_alloc(tag, shared_size, alignment, boundary,
	&segs_array[0], NSEGS, &actual_segments, flags);

    error = bus_dmamem_map(tag, segs_array, actual_segments, shared_size,
	&kva, flags);

    error = bus_dmamap_load(tag, dm, kva, shared_size, NULL, flags);

    [now we can talk to the shared memory via "kva"
     and the device can talk to it via dm->dm_segs[i].ds_addr,
     for dm->dm_nsegs values of i]

There is an awful lot of repetition here.  In particular, the
"bus_dmamap_t" actually contains almost everything bus_dmamem_alloc()
needs.  (In practically all cases -- maybe always -- NSEGS
is 1, but this is more or less irrelevant.)  More generally, we
can say -- well, *I* can say, since I actually named a lot of this
stuff originally :-) -- that the *intent* of a bus_dmamap_t data
structure is to designate any hardware-imposed constraints on "DMA
side" addressing.  For instance, if some bus provides 32-bit
addressing, but a given device is only able to use 16- or 24-bit
addressing, the "boundary" might be (1<<16) or (1<<24).  (Limits
like these are why the i386 needs bounce buffers, for instance.)
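
For a concrete reading of "boundary": bus_dma(9) describes it as a
limit that a single transfer must not cross.  The check below is a
plain userland sketch of that rule, not kernel code.  The name
crosses_boundary() is made up for illustration, and the sketch assumes
the boundary is a nonzero power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative helper, not part of bus_dma(9).
 * Returns nonzero if the byte range [addr, addr + len) crosses a
 * multiple of "boundary".  Assumes boundary is a nonzero power of
 * two and len is nonzero.
 */
static int
crosses_boundary(uint64_t addr, uint64_t len, uint64_t boundary)
{
    uint64_t mask;

    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    assert(len != 0);

    /* The first and last byte must share one boundary-sized window. */
    mask = ~(boundary - 1);
    return (addr & mask) != ((addr + len - 1) & mask);
}
```

With a boundary of (1<<16), a 16-byte range starting at 0xFFF0 ends
exactly at the limit and passes, while a 17-byte range from the same
address would have to be split or bounced.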

[The "alignment" argument is, somewhat inappropriately, separated
from this data structure.  Furthermore, the use of the segs_array
as "magic cookies" to be handed to bus_dmamem_map() is suspect at
best.  But all this is historical artifact; onward.]

Now, if a bus_dmamap_t tells you about the DMA-side addressing,
and most of this information is redundant, why not present a
simplified interface:

    /* again, "assuming no error" */
    error = bus_dmamap_create(... as before ...);
    error = bus_dmamem_getshared(tag, dm, alignment, &kva, flags);

(again, the alignment really belongs inside "dm", but it is too
late for that).  The allocated memory would be mapped at the kernel
virtual address returned in "kva", and device-accessible via the
dm->dm_nsegs entries of dm->dm_segs[i], each giving a bus address
(.ds_addr) and a length (.ds_len).

To tear down the mapping and release the memory:

    error = bus_dmamem_relshared(tag, dm, kva);
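
To show the shape of the idea, here is a userland mock-up of the
getshared/relshared pair.  Everything in it is a stand-in: struct map
and struct seg only imitate bus_dmamap_t and bus_dma_segment_t, the
mem_alloc/mem_map/map_load helpers imitate the existing bus_dma calls
using malloc(), and the tag, alignment and flags arguments are dropped
for brevity.  The point it illustrates is that the size comes from the
map, and that the wrapper owns the unwinding on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Userland stand-ins; these are NOT the real bus_dma(9) types. */
struct seg { uintptr_t ds_addr; size_t ds_len; };
struct map {
    size_t     size;        /* recorded at "create" time */
    int        dm_nsegs;    /* 0 while nothing is loaded */
    struct seg dm_segs[1];  /* NSEGS is 1 in this sketch */
};

static int fail_step;       /* test hook: 1=alloc, 2=map, 3=load fails */
static int live_allocs;     /* test hook: outstanding allocations */

/* Stand-in for bus_dmamem_alloc(): one segment from malloc(). */
static int
mem_alloc(size_t size, struct seg *seg)
{
    void *p;

    if (fail_step == 1 || (p = malloc(size)) == NULL)
        return ENOMEM;
    seg->ds_addr = (uintptr_t)p;
    seg->ds_len = size;
    live_allocs++;
    return 0;
}

/* Stand-in for bus_dmamem_free(). */
static void
mem_free(const struct seg *seg)
{
    free((void *)seg->ds_addr);
    live_allocs--;
}

/* Stand-in for bus_dmamem_map(): identity mapping in userland. */
static int
mem_map(const struct seg *seg, void **kvap)
{
    if (fail_step == 2)
        return ENOMEM;
    *kvap = (void *)seg->ds_addr;
    return 0;
}

/* Stand-in for bus_dmamap_load(): "bus address" == pointer value. */
static int
map_load(struct map *dm, void *kva, size_t len)
{
    if (fail_step == 3)
        return EFBIG;
    dm->dm_segs[0].ds_addr = (uintptr_t)kva;
    dm->dm_segs[0].ds_len = len;
    dm->dm_nsegs = 1;
    return 0;
}

/* Hypothetical getshared: the caller never repeats the size. */
static int
getshared(struct map *dm, void **kvap)
{
    struct seg seg;
    int error;

    assert(dm->dm_nsegs == 0);  /* map must not already be loaded */
    if ((error = mem_alloc(dm->size, &seg)) != 0)
        return error;
    if ((error = mem_map(&seg, kvap)) != 0)
        goto fail;
    if ((error = map_load(dm, *kvap, dm->size)) != 0)
        goto fail;              /* a kernel version would also unmap */
    return 0;
fail:
    mem_free(&seg);
    *kvap = NULL;
    return error;
}

/* Hypothetical relshared: unload, then release the one segment. */
static int
relshared(struct map *dm, void *kva)
{
    struct seg seg;

    if (dm->dm_nsegs != 1 || (void *)dm->dm_segs[0].ds_addr != kva)
        return EINVAL;
    seg = dm->dm_segs[0];       /* the "magic cookie" */
    dm->dm_nsegs = 0;
    mem_free(&seg);
    return 0;
}
```

A driver's attach routine would then shrink to one create call plus
one getshared call, with a single error path instead of three.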

This still has a minor flaw: currently NSEGS almost has to be
1, and with getshared/relshared, it *would* have to be 1, because
we have only one kva to map the various segments (which might
otherwise require discontiguous kernel addresses as well as
discontiguous bus addresses -- imagine situations in which the
kernel page size(s) differs from the I/O-device page size(s)).  If
there is hardware that can actually access SCBs, rings, whatever,
across multiple discontiguous segments, one might need getshared
to take "max_number_of_kvas, &kva_array[0], &actual_number_of_kvas".
But I think this is not something to worry about.

This is not something I consider enormously important, but I think
it would make the act of allocating shared control structures a
lot simpler and more obvious.  Note that bus_dmamap_sync() calls
are still needed for control structures that are cached on the CPU
and/or hardware-device level, but there are no changes other than
in the alloc-and-free sequence (well, someday, the sneaky use of
ds_addr/ds_len pairs as magic cookies can go away too).

Chris