Subject: FreeBSD Bus DMA (was Re: AdvanSys board support)
To: Justin T. Gibbs <>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: tech-kern
Date: 06/10/1998 15:49:03

Thanks for taking the time to explain your changes to bus_dma.

Speaking for myself: my major interest is in networking and
network-related I/O, and I'm "chief portslave" for the pmax and mips
ports.

Again, just speaking for myself, enforcing a "hierarchical" arrangement
of busses and bus-space tags doesn't seem very compelling. Perhaps I've
had too much experience with weirdo workstation designs where the
bus-dma for a sub-bus, or baseboard bus, or bus bridge, needs to be
implemented completely differently than for the "parent" bus.  For
some systems, there could be multiple distinct topologies for the same
kind of bus, even on the same system. The example that always comes
first to my mind is a VAX Unibus, which could be attached:

       via an integrated bus adaptor (730);
       via a UBA750 or UBA780 on the CPU backplane;
       via a UBA780, via an SBI-to-Abus SBI adaptor on an 8600/860;
       via a DWBUA on a vaxbi-backplane machine (8200/8250/8300/8350);
       via an XMI-to-UBA adaptor on an XMI machine;
       via a DWBUA on a vaxbi bus, via an XMI-to-vaxbi adaptor
	   (some of these never made much sense from a performance
	    standpoint, but people still used them);

	    ... and so on.
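The per-bus-implementation point above can be sketched in C: give each bus its own tag of DMA method pointers instead of inheriting the parent's behavior. This is purely illustrative (the struct and function names here are hypothetical, not the actual bus_dma interface):

```c
#include <assert.h>

/* Hypothetical sketch: each bus attachment supplies its own DMA
 * methods via a tag, so e.g. a direct-mapped mainbus and an 18-bit
 * Unibus behind a map-register adaptor can implement mapping
 * completely differently. */
typedef unsigned long bus_addr;

struct dma_tag {
	const char *name;
	/* translate a host physical address to a bus address */
	bus_addr (*map)(struct dma_tag *, bus_addr hostpa);
};

/* Direct-mapped parent bus: bus address == host address. */
static bus_addr
direct_map(struct dma_tag *t, bus_addr pa)
{
	(void)t;
	return pa;
}

/* 18-bit Unibus: only the low 18 bits appear on the bus; the
 * adaptor's map registers supply the upper host-address bits. */
static bus_addr
unibus_map(struct dma_tag *t, bus_addr pa)
{
	(void)t;
	return pa & 0x3ffff;
}

struct dma_tag mainbus_tag = { "mainbus", direct_map };
struct dma_tag unibus_tag  = { "unibus",  unibus_map };
```

A driver handed a tag just calls through it, without caring which adaptor sits above the device.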

Reading between the lines of your message suggests you're thinking
purely of scatter/gather lists in host memory, or maybe in a device
mailbox.  There are other schemes that are better thought of as an
outboard TLB, which maps bus addresses to host memory addresses.  That
was very common on earlier (e.g., DEC) hardware, for mapping from 16-
to 32-bit bus addresses, and I understand some of the Alphas still work
that way.  On these systems you need to allocate hardware mapping
resources all the way down the bus topology -- potentially allocating
a contiguous region of bus address-space for the DMA at *each*
intermediate bus, and setting up a bus-address-mapping TLB entry at
each level.
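The map-register allocation described above might look like the following toy sketch: find a contiguous run of free map registers (a contiguous region of bus address-space), program each one to point at a host page, and hand back the resulting bus address. Register count, page size, and names are all made up for illustration:

```c
#include <assert.h>

#define NMAPREG	32		/* map registers on this (toy) adaptor */
#define PGSHIFT	12		/* assumed 4KB pages */

static unsigned long mapreg[NMAPREG];	/* bus-page -> host-page "TLB" */
static unsigned char mapinuse[NMAPREG];

/* Find npages contiguous free map registers, program them to point at
 * the host pages starting at hostpa, and return the resulting bus
 * address.  Returns (unsigned long)-1 if no contiguous run is free --
 * the resource-shortage case the driver would then have to wait on. */
unsigned long
dma_map_alloc(unsigned long hostpa, int npages)
{
	int base, i;

	for (base = 0; base + npages <= NMAPREG; base++) {
		for (i = 0; i < npages; i++)
			if (mapinuse[base + i])
				break;
		if (i < npages)
			continue;		/* run not free; keep looking */
		for (i = 0; i < npages; i++) {
			mapinuse[base + i] = 1;
			mapreg[base + i] = (hostpa >> PGSHIFT) + i;
		}
		return (unsigned long)base << PGSHIFT;
	}
	return (unsigned long)-1;
}
```

With nested bus adaptors, the bus address returned here would itself have to be mapped again at the next adaptor down, which is where the per-level allocation cost comes from.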

I don't quite see how the combination of all the above -- on different
models of the same CPU family -- maps into a hierarchical scheme with
pure inheritance. (I'm sure other NetBSD developers could come up with
more examples.)  Am I misunderstanding what you mean by a hierarchy?

2)  Lazy evaluation of dma-load operations.

Does this imply that requests get reordered, or that the device is
blocked waiting for resources?  The constraints on network devices are
very different from "block" storage devices (disk, tape, CD, random
SCSI devices, what-have-you).  I'd be very unhappy if large packets
got consistently reordered behind smaller packets while the driver
waited for "enough" resources for the larger packets. Maybe the lazy
eval is a win; I dunno.
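To make the reordering worry concrete: if a deferred large request lets later small requests jump ahead, packets leave out of order. The FIFO discipline that avoids this is sketched below -- a toy model with made-up names, not any actual implementation:

```c
#include <assert.h>

/* Toy model of dma-load deferral: 'avail' map registers remain; each
 * request needs 'need' of them.  To preserve packet ordering, a new
 * request queues behind any already-deferred request even when enough
 * resources exist to satisfy it immediately. */
#define QMAX	8
static int avail = 4;
static int waitq[QMAX];
static int waithead, waittail;

/* Returns 1 if the load proceeds now, 0 if it was deferred. */
int
dma_load(int need)
{
	if (waittail == waithead && need <= avail) {
		avail -= need;
		return 1;
	}
	waitq[waittail++] = need;	/* defer, keeping FIFO order */
	return 0;
}
```

Dropping the "queue behind earlier waiters" rule would let the small packet through in the scenario below, which is exactly the consistent reordering I'd want to avoid.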

I'm sure I've seen disagreement between you and Jason over the costs
and benefits of creating "S/G lists" before.  Assuming neither of you
has changed your respective positions, how large a fraction of your
changes are the "S/G list" optimizations (to get closer to the MD
representation, e.g., for SCSI devices)?  That seems like the key
space and performance issue, yes?
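As I understand the question, the cost at issue is roughly the copy step below: the MI layer produces generic (address, length) segments, and the driver then transcribes them into the controller's own descriptor format. Both structure layouts here are hypothetical stand-ins, just to show the shape of the copy that building segments in the MD representation would avoid:

```c
#include <assert.h>
#include <stdint.h>

/* Generic MI segment, as a dma-load might produce. */
struct bus_dma_segment {
	uint32_t ds_addr;
	uint32_t ds_len;
};

/* Hypothetical controller-specific S/G entry layout. */
struct dev_sg_entry {
	uint32_t sg_addr;
	uint32_t sg_count;
};

/* Transcribe MI segments into the device's S/G format -- the per-I/O
 * copy that an MD-representation optimization would eliminate. */
int
copy_sglist(const struct bus_dma_segment *segs, int nsegs,
    struct dev_sg_entry *out)
{
	int i;

	for (i = 0; i < nsegs; i++) {
		out[i].sg_addr  = segs[i].ds_addr;
		out[i].sg_count = segs[i].ds_len;
	}
	return nsegs;
}
```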