Subject: Re: FreeBSD Bus DMA (was Re: AdvanSys board support)
To: Jonathan Stone <jonathan@DSG.Stanford.EDU>
From: Justin T. Gibbs <gibbs@plutotech.com>
List: tech-kern
Date: 06/10/1998 23:01:04
>Again, just speaking for myself, enforcing a "hierarchical" arrangement
>of busses and bus-space tags doesn't seem very compelling. Perhaps I've
>had too much experience with weirdo workstation designs where the
>bus-dma for a sub-bus, or baseboard bus, or bus bridge, needs to be
>implemented completely differently than for the "parent" bus.  For
>some systems, there could be multiple distinct topologies to the same
>kind of bus, even on the same system. The example that always comes
>first to my mind is a vax Unibus, which could be:

At the device driver level, some tag, already constructed to deal with this
particular attachment, is passed into the driver.  The idea is that this
tag can be further refined by the bus specific attach code to dictate the
generic DMA capabilities of the device.  For example:

	PCI bus dma tag passed into a pci device's attach routine
		- Indicates it can handle a 32 or 64 bit address space
		  depending on host bridge.
		- Specifies no limit on the number of S/G segments that
		  can be specified.
		- Places no boundary constraints.
		- Places no alignment constraints.

	pci device attach routine derives a tag from the passed in parent
		- Adds device specific boundary constraints.
		- Adds device specific alignment constraints.

	pci device calls MI attach routine which derives a tag for each
	specific type of allocation
		- tag specifies number of S/G segments
		- tag specifies maximum number of concurrent transactions
		- tag specifies any application specific alignment
		  and boundary constraints.

In the FreeBSD implementation, the parent tags are usually only referenced
during tag instantiation so that restrictions from the parent are honored
by the child.  How the base parent tag is constructed is up to the
particular port and the MD constraints of the hardware.  In some instances,
like the
x86 port, a hierarchical approach for building these tags just falls out.
For others, the implementation is free to do whatever makes sense.  The
inheritance is enforced near the leaves where it aids the task of MI driver
code interfacing with MD/MB attach front ends.
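
As an illustration of the derivation described above, here is a minimal
sketch in plain C.  The struct, its fields, and dma_tag_derive() are
hypothetical names invented for this example and are not the FreeBSD
bus-dma interface.  It assumes alignments are powers of two and that a
value of 0 means "no limit" for the boundary and segment count.  The
parent is read only while the child is built, so the child carries the
merged restrictions and keeps no reference to its parent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified DMA tag; not the real FreeBSD bus_dma_tag_t. */
struct dma_tag {
	uint64_t highaddr;	/* highest address the device can reach */
	uint64_t alignment;	/* power of two, 1 = no constraint */
	uint64_t boundary;	/* segments may not cross this, 0 = none */
	unsigned nsegments;	/* max S/G segments, 0 = unlimited */
};

/* Smaller of two limits, where 0 means "no limit". */
static uint64_t
min_nonzero(uint64_t a, uint64_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/*
 * Derive a child tag.  The parent is consulted only here; the result
 * holds the stricter of each pair of constraints.
 */
static struct dma_tag
dma_tag_derive(const struct dma_tag *parent, uint64_t highaddr,
    uint64_t alignment, uint64_t boundary, unsigned nsegments)
{
	struct dma_tag t;

	assert(parent != NULL);
	assert(alignment != 0 && parent->alignment != 0);

	t.highaddr = highaddr < parent->highaddr ? highaddr : parent->highaddr;
	t.alignment = alignment > parent->alignment ?
	    alignment : parent->alignment;
	t.boundary = min_nonzero(boundary, parent->boundary);
	t.nsegments = (unsigned)min_nonzero(nsegments, parent->nsegments);
	return t;
}
```

A bus tag limited to 32-bit addresses with no other constraints could be
refined by a device attach routine that adds a 4-byte alignment and a 64KB
boundary, and refined again by the MI code with a segment limit.  The
final tag would then still carry the 32-bit limit, the alignment, and the
boundary.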


>Reading between the lines of your message suggests you're thinking
>purely of scatter/gather lists in host memory or maybe in a device
>mailbox.  There're other schemes that are better thought of as an
>outboard TLB, which map bus addresses to host memory addresses.  That
>was very common on earlier (e.g., DEC) hardware, for mapping from 16 to
>32-bit bus addresses, and I understand some of the Alphas still work
>that way.  On these systems you need to allocate hardware mapping
>resources all the way down the bus topology -- potentially allocating
>a contiguous region of bus address-space for the DMA at *each*
>intermediate bus, and setting up a bus-address-mapping TLB entry
>appropriately.

I was not thinking purely in terms of S/G I/O, and it is unclear to me
why you believe that a tag is not still free to allocate whatever resources
it needs in any way that it sees fit.  The FreeBSD implementation only 
differs from NetBSD in this regard when you look at the dma map object.
In FreeBSD, it is completely opaque which allows you to pass a client a
shared resource if it makes sense to implement the map this way.  In the
FreeBSD implementation, if no mapping or bouncing is required in order to
satisfy the requests of a client, the returned map is always the "dummy 
map" object.  This again saves on space for this particular implementation.
Will it save space for other implementations?  I can't say, but I've
already shown how this increased flexibility can be advantageous.
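
The shared "dummy map" idea can be sketched roughly as follows.  This is
an illustration only: struct dma_map, dma_map_create(), and
dma_map_destroy() are hypothetical names, and the needs_bounce flag stands
in for whatever test a real implementation would derive from the tag.
Because the client only ever holds a pointer to an opaque object, every
client that needs no mapping or bouncing can be handed the same static
instance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical opaque map; clients never look inside it. */
struct dma_map {
	void  *bounce_pages;	/* NULL for the shared dummy map */
	size_t bounce_len;
};

/* One shared instance for every client that needs no mapping. */
static struct dma_map dummy_map;

static struct dma_map *
dma_map_create(bool needs_bounce, size_t bounce_len)
{
	struct dma_map *m;

	if (!needs_bounce)
		return &dummy_map;	/* no per-client allocation */

	assert(bounce_len > 0);
	m = malloc(sizeof(*m));
	if (m == NULL)
		return NULL;
	m->bounce_pages = malloc(bounce_len);
	if (m->bounce_pages == NULL) {
		free(m);
		return NULL;
	}
	m->bounce_len = bounce_len;
	return m;
}

static void
dma_map_destroy(struct dma_map *m)
{
	/* The dummy map is static and shared, so it is never freed. */
	if (m == NULL || m == &dummy_map)
		return;
	free(m->bounce_pages);
	free(m);
}
```

Two clients that need no bouncing get the same pointer back, so the
per-map cost in that case is zero.  Only a client that really needs
bounce storage pays for an allocation.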

>I don't see quite how the combination of all the above -- on different
>models of the same CPU family -- maps into a hierarchical
>scheme with pure inheritance. (I'm sure other NetBSD developers could
>come up with more examples) Am I misunderstanding what you mean by
>hierarchy?

Most likely.  Perhaps my example above clears this up.

>2)  Lazy evaluation of dma-load operations.
>
>Does this imply that requests get reordered, or that the device is
>blocked waiting for resources?  The constraints on network devices are
>very different from "block" storage devices ( disk, tape, CD, random
>SCSI devices, what-have-you).  I'd be very unhappy if large packets
>got consistently reordered behind smaller packets while the driver
>waited for "enough" resources for the smaller packets. Maybe the lazy
>eval is a win; I dunno.

The CAM SCSI layer is quite paranoid about keeping the order of
transactions the same as that specified by the client.  So, the CAM driver
code takes care to freeze its internal transaction queue as soon as a
request is deferred, and un-freeze its queue once resources become
available.  The FreeBSD bus-dma code only imposes FIFO ordering of deferred
requests.  It didn't seem correct to have the bus-dma code enforce stricter
ordering semantics that may not apply to all clients.
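
To make the ordering behaviour concrete, here is a small sketch under
stated assumptions.  All names (dma_pool, dma_load, dma_release, log_done)
are hypothetical, and the resource is modelled as a simple page count.
Deferred requests complete in FIFO order, but a later request that fits in
the currently free resources runs at once, which is why a client wanting
strict ordering would stop submitting as soon as dma_load() returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 8
#define MAX_LOG      16

typedef void (*dma_done_fn)(void *arg);

struct deferred_req {
	unsigned    pages;
	dma_done_fn fn;
	void       *arg;
};

struct dma_pool {
	unsigned            free_pages;
	struct deferred_req q[MAX_DEFERRED];	/* ring buffer */
	size_t              head, count;
};

/* Returns true if the load ran immediately, false if it was deferred. */
static bool
dma_load(struct dma_pool *p, unsigned pages, dma_done_fn fn, void *arg)
{
	size_t tail;

	if (pages <= p->free_pages) {
		p->free_pages -= pages;
		fn(arg);
		return true;
	}
	assert(p->count < MAX_DEFERRED);
	tail = (p->head + p->count) % MAX_DEFERRED;
	p->q[tail].pages = pages;
	p->q[tail].fn = fn;
	p->q[tail].arg = arg;
	p->count++;
	return false;
}

/* Return pages to the pool and run deferred requests, oldest first. */
static void
dma_release(struct dma_pool *p, unsigned pages)
{
	p->free_pages += pages;
	while (p->count > 0 && p->q[p->head].pages <= p->free_pages) {
		struct deferred_req r = p->q[p->head];

		p->head = (p->head + 1) % MAX_DEFERRED;
		p->count--;
		p->free_pages -= r.pages;
		r.fn(r.arg);
	}
}

/* Example callback: log the request id so completion order is visible. */
static int    done_log[MAX_LOG];
static size_t done_count;

static void
log_done(void *arg)
{
	assert(done_count < MAX_LOG);
	done_log[done_count++] = *(int *)arg;
}
```

With two free pages, a 3-page request is deferred, a following 1-page
request runs immediately, and a 2-page request after that is deferred
behind the first.  As pages are released, the two deferred requests
complete in the order they were queued.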

>I'm sure I've seen disagreement between you and Jason, over the costs
>and benefits of creating "S/G lists" before.  Assuming neither of you
>have changed your respective positions, how large a fraction of your
>changes are the "S/G list" optimizations (to get closer to the MD
>representation, e.g., for SCSI devices)?  That seems like the key
>space and performance issue, yes?

In order to cut the space, you must move to a callback to win in all cases.
This change has the largest impact of any of the FreeBSD changes. The
AdvanSys controllers, for instance, simply PIO their S/G list directly to
the card (not a great design, but that's what it is) so no static storage
of any type is wanted.  If you are willing to force a single S/G copy, you
would have to export the S/G list format in some way into the MI code so
that it could be constructed properly.  This could turn nasty.
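
The callback approach can be illustrated with the sketch below.  It is
not the FreeBSD code: dma_load(), struct dma_segment, struct fake_card,
and pio_sg_to_card() are hypothetical, bus addresses are assumed equal to
the addresses passed in, and segments are simply split at 4KB page
boundaries.  The segment array lives on the stack only for the duration
of the call, and the driver's callback consumes it in whatever form the
hardware wants, so nothing has to be stored in the map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ  4096u
#define MAX_SEGS 8

struct dma_segment {
	uint64_t addr;
	uint32_t len;
};

typedef void (*dma_callback)(void *arg, const struct dma_segment *segs,
    int nsegs);

/*
 * Split [addr, addr + len) at page boundaries and hand the temporary
 * list to the callback.  Returns 0 on success, -1 if too many segments.
 */
static int
dma_load(uint64_t addr, uint32_t len, dma_callback cb, void *arg)
{
	struct dma_segment segs[MAX_SEGS];	/* valid only during cb() */
	int n = 0;

	assert(cb != NULL);
	while (len > 0) {
		uint32_t room = PAGE_SZ - (uint32_t)(addr % PAGE_SZ);
		uint32_t chunk = len < room ? len : room;

		if (n == MAX_SEGS)
			return -1;
		segs[n].addr = addr;
		segs[n].len = chunk;
		n++;
		addr += chunk;
		len -= chunk;
	}
	cb(arg, segs, n);
	return 0;
}

/* Stand-in for a card that takes S/G entries one at a time via PIO. */
struct fake_card {
	uint64_t total_bytes;
	int      entries;
};

static void
pio_sg_to_card(void *arg, const struct dma_segment *segs, int nsegs)
{
	struct fake_card *card = arg;
	int i;

	for (i = 0; i < nsegs; i++) {
		/* A real driver would write segs[i] to the card here. */
		card->total_bytes += segs[i].len;
		card->entries++;
	}
}
```

A driver whose hardware wants its own S/G descriptor format would do the
conversion inside its callback instead, so that format never has to be
exported to the MI code.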

--
Justin