tech-kern: Re: Rev 1.19 of busdma.doc

Subject: Re: Rev 1.19 of busdma.doc
To: Jason Thorpe <thorpej@nas.nasa.gov>
From: Justin T. Gibbs <gibbs@freefall.freebsd.org>
List: tech-kern
Date: 11/11/1996 23:01:25
>On Sat, 09 Nov 1996 17:26:59 -0800 
> "Justin T. Gibbs" <gibbs@freefall.freebsd.org> wrote:
>
> > There is nothing here preventing it, but it would be nice to explicitly
> > allow the system to choose an inline variant of the specified function if it
> > is 'known' to the system.  For example, I expect many of the SCSI drivers
> > to use a single common routine to do this, and I don't see any reason why
> > some arches might not want to short circuit the function calls.  I think
> > you should also pass the index of the SG entry you are creating
> > for reasons I'll make clear below.
>
>As I've stated before, the fact that several PC SCSI controllers may
>use a similar (or identical) s/g list format is simply irrelevant.  We
>should pick an abstraction and stick with it.

How does this violate the abstraction?  Why do function calls when you
don't have to, especially when that function call is going to do two
assignments and return?  This is just needless overhead.
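
As a minimal sketch of the overhead in question, assume a 32-bit addr/len
s/g format.  The names sg_entry, sg_fill_func_t, sg_fill_generic and
load_segments are hypothetical and do not come from busdma.doc.  The only
point is that a loader which recognizes the common routine could do the
two stores inline instead of making an indirect call per segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical s/g entry: a plain addr/len pair. */
struct sg_entry {
	uint32_t addr;
	uint32_t len;
};

/* Hypothetical callback type, including the proposed index argument. */
typedef void (*sg_fill_func_t)(void *arg, size_t idx, uint32_t addr,
    uint32_t len);

/* The common routine: two assignments and a return. */
void
sg_fill_generic(void *arg, size_t idx, uint32_t addr, uint32_t len)
{
	struct sg_entry *sg = arg;

	sg[idx].addr = addr;
	sg[idx].len = len;
}

/*
 * Hypothetical loader.  If the caller passed the routine that is "known"
 * to the system, do the stores inline; otherwise make the indirect call.
 */
void
load_segments(sg_fill_func_t fn, void *arg, const struct sg_entry *segs,
    size_t nsegs)
{
	size_t i;

	assert(fn != NULL);
	if (fn == sg_fill_generic) {
		struct sg_entry *sg = arg;

		for (i = 0; i < nsegs; i++) {
			sg[i].addr = segs[i].addr;
			sg[i].len = segs[i].len;
		}
		return;
	}
	for (i = 0; i < nsegs; i++)
		fn(arg, i, segs[i].addr, segs[i].len);
}
```

Either path leaves the same s/g list behind, so a driver would see the
same behaviour whichever one the system picks.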

Oh, and it's not just PC SCSI controllers, as I've stated before.  It's almost
any EISA or PCI busmastering card that uses this format.

> > How can you make this guarantee if you are going to support "compaction" of
> > the address to fit the limits of the target SG list?  I could imagine the
> > implementation attempting the map first and then resorting to compaction
> > and starting over with the first segment, only after it has exhausted all of
> > the SG segments.  The only way to know if you fit is to walk the pages, and
> > while you are walking the pages, you might as well fill in the SG values as
> > you go.  This is why I think this guarantee should be replaced by an
> > explanation as to why it may be called more than "nsegments" times and the
> > index should be added to bus_dmamap_load_func_t.
>
>I'm not sure what you mean by "compaction" here... For the purpose of
>my answer, I'll assume that you mean "single addr/len pair for multiple
>adjacent pages".

I mean, "make this fit in 16 SG segments".

>Passing an index to the callback is not only unnecessary, but it may
>not make sense in the context of the callback function.  For example,
>what if the s/g descriptor for a device is implemented as some sort
>of FIFO?
>
>The correct way for this to happen is for bus_dmamap_load() to "look ahead"
>and perform compaction on-the-fly, conforming to the constraints of
>the device as specified in bus_dmamap_create().  By doing this, you
>eliminate unnecessary calls to the callback altogether.  This is really
>just a trivial implementation detail, and doesn't warrant adding
>complexity to the interface.
>
>I'm sticking with the definition of the callback being called at most
>"nsegments" times, and not adding an index argument to the callback.

This forces the implementation to construct a temporary version of the SG
list and then iterate over that list only once it's complete, any time the
number of SG segments is less than the number of pages to be transferred.
That seems a waste, especially if the OS takes some care to attempt to
allocate physically contiguous pages when it can (as is the case in
FreeBSD).  I think that the clients that have some kind of FIFO requirement
should pay this price, both in speed and complexity, themselves.
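
A rough sketch of the alternative, under stated assumptions: the loader
fills entries through an indexed callback while it walks the pages, and
only if it runs out of entries does it start over at index 0, this time
merging physically adjacent pages.  Merging adjacent pages is just one
assumed way to "make it fit"; load_pages, sg_list and sg_fill are
hypothetical names.  No temporary list is built, but the callback can
fire more than "nsegments" times, which is why it needs the index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sg_entry {
	uint32_t addr;
	uint32_t len;
};

/* Hypothetical client state: the device's s/g list plus a call counter. */
struct sg_list {
	struct sg_entry ent[16];
	unsigned calls;
};

typedef void (*sg_fill_func_t)(void *arg, size_t idx, uint32_t addr,
    uint32_t len);

/* Indexed callback: overwrites entry idx, so a restart is harmless. */
void
sg_fill(void *arg, size_t idx, uint32_t addr, uint32_t len)
{
	struct sg_list *l = arg;

	l->ent[idx].addr = addr;
	l->ent[idx].len = len;
	l->calls++;
}

/*
 * Hypothetical loader.  Returns the number of entries used, or -1 if the
 * pages do not fit in maxsegs entries even after merging.
 */
int
load_pages(const uint32_t *pages, size_t npages, uint32_t pgsize,
    size_t maxsegs, sg_fill_func_t fn, void *arg)
{
	size_t i, idx;
	uint32_t addr, len;

	assert(fn != NULL && maxsegs > 0);
	if (npages == 0)
		return 0;

	/* Pass 1: one entry per page, filled while walking. */
	for (i = 0; i < npages && i < maxsegs; i++)
		fn(arg, i, pages[i], pgsize);
	if (i == npages)
		return (int)npages;

	/* Pass 2: entries exhausted; restart at index 0 and merge. */
	idx = 0;
	addr = pages[0];
	len = pgsize;
	for (i = 1; i < npages; i++) {
		if (pages[i] == addr + len) {
			len += pgsize;	/* adjacent: grow current run */
			continue;
		}
		fn(arg, idx++, addr, len);
		if (idx == maxsegs)
			return -1;	/* another run, but no entry left */
		addr = pages[i];
		len = pgsize;
	}
	fn(arg, idx++, addr, len);
	return (int)idx;
}
```

A client with a FIFO-style descriptor could not use a callback like
sg_fill directly and would have to stage the list itself, which is the
cost argued for above.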

> > We also need an additional "BUS_DMA_COMPACTTOFIT" flag so that clients have
> > to explicitly ask for compaction to occur.  This allows clients that can
> > easily break up transfers to split the data up by filling their SG list up
> > and passing the residual back up to the caller that generated the transaction.
> > At the same time, it cleanly addresses the needs of things like the st
> > driver.  For this to completely work, bus_dmamap_load would have to somehow
> > report the residual on error.
>
>Uhh... no.  First of all, the "st" driver shouldn't be at all concerned
>with how data is transferred at the lower level.

If the data is not transferred in one command, you can't read or write that
tape.  In this sense, the "st" driver is quite concerned with how data is
transferred at the lower level.

>In fact, the "st" driver
>has no way to pass such a flag to bus_dmamap_load().  I don't see how
>this cleanly addresses anything.

Come on, Jason.  This is software, anything is possible, just use your
imagination.  The controller drivers could look at a flag set in the
SCSI xfer to determine if this was a transfer that had this requirement.
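
For illustration only, such a check could look like the sketch below.
BUS_DMA_COMPACTTOFIT is the flag proposed earlier in this mail, with an
assumed value; XS_NOSPLIT, xfer_sketch and dma_flags_for_xfer are
hypothetical names, not existing SCSI or bus dma definitions.

```c
#include <assert.h>
#include <stddef.h>

#define BUS_DMA_COMPACTTOFIT	0x01	/* proposed flag; value assumed */
#define XS_NOSPLIT		0x01	/* hypothetical per-transfer flag */

/* Hypothetical stand-in for the SCSI transfer structure. */
struct xfer_sketch {
	int flags;
	size_t datalen;
};

/*
 * Controller driver side: translate the per-transfer requirement into
 * the flags it would hand to bus_dmamap_load().
 */
int
dma_flags_for_xfer(const struct xfer_sketch *xs)
{
	assert(xs != NULL);
	return (xs->flags & XS_NOSPLIT) ? BUS_DMA_COMPACTTOFIT : 0;
}
```

The "st" driver would set the per-transfer flag and never call
bus_dmamap_load() itself.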

>Secondly, in my little world, these DMA transactions are "atomic".  It's
>all or nothing.  I don't want to supply the rope that would allow for
>"most" of a transaction to occur, and have the transfer of the residual
>fail.  That stands the possibility of introducing a lot of extra code
>complexity.

Not a lot of complexity in the bus dma code.  This is a client issue.  If
the SCSI code has to break up a transfer so that it can fit on a cheesy
controller, that's its business and of course it will have to recover
correctly in the event of a failure.  This is really no different than any
other SCSI command returning with a nonzero residual except for the fact that
a logical command cannot be signaled as complete until all "children"
physical commands are complete.
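
The bookkeeping this implies on the client side can be sketched as
follows.  This is an assumption about how a SCSI layer might track a
split transfer, not code from any existing driver; logical_cmd,
logical_start and child_done are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state for one logical command split into children. */
struct logical_cmd {
	unsigned pending;	/* children still outstanding */
	size_t resid;		/* bytes not transferred, summed over children */
	int error;		/* first child error seen, 0 if none */
	int done;		/* set once, when the last child finishes */
};

void
logical_start(struct logical_cmd *lc, unsigned nchildren)
{
	assert(lc != NULL && nchildren > 0);
	lc->pending = nchildren;
	lc->resid = 0;
	lc->error = 0;
	lc->done = 0;
}

/* Called as each child physical command completes, in any order. */
void
child_done(struct logical_cmd *lc, size_t resid, int error)
{
	assert(lc != NULL && lc->pending > 0);
	lc->resid += resid;
	if (error != 0 && lc->error == 0)
		lc->error = error;
	/* The logical command is signaled only after the last child. */
	if (--lc->pending == 0)
		lc->done = 1;
}
```

A failed child shows up as an error and a nonzero residual on the
logical command, handled like any other short transfer.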

This is a very "real" use of this interface.  It's nice to see that it
doesn't matter.

>Jason R. Thorpe                                       thorpej@nas.nasa.gov
>NASA Ames Research Center                               Home: 408.866.1912
>NAS: M/S 258-6                                          Work: 415.604.0935
>Moffett Field, CA 94035                                Pager: 415.428.6939

--
Justin T. Gibbs
===========================================
  FreeBSD: Turning PCs into workstations
===========================================