Subject: Re: Preliminary work on ath(4)
To: Bruce J.A. Nourish <bjan+current-users@bjan.net>
From: Sam Leffler <sam@errno.com>
List: current-users
Date: 08/20/2003 16:32:25
> On Wed, Aug 20, 2003 at 02:00:37PM +0200, Frank van der Linden wrote:
>> On FreeBSD and NetBSD bus_dma, what you do is basically
>> the following:
>>
>> 	* FreeBSD creates bus_dma tags from parent tags, and stores
>> 	  some parameters in them. We don't do that. Just use the
>> 	  bus_dma tag passed in through the attach argument structure.
>> 	  But, remember some of the parameters passed to the FreeBSD
>> 	  bus_dmatag_create call, they are used in some NetBSD
>> 	  bus_dma calls explicitly (where FreeBSD only passes the tag)
>>
>> 	* The callback function for the bus_dmamap_load function in
>> 	  FreeBSD: just inline the callback function from the FreeBSD
>> 	  source directly after the bus_dmamap_load call in NetBSD.
>
> Thanks indeed for the advice, but I'm still in fairly deep water here.
> Perhaps Sam or yourself could tell me if I'm understanding the current
> (i.e. FreeBSD-designed) code correctly:
>
> * There is a device-wide DMA tag: sc->sc_dmat.
> * These is a device-wide area of DMA memory, described by sc->sc_ddmamap,
>   created, alloc'd and loaded early in ath_desc_alloc. A pointer to the
>   "KVA mapping" (whatever that is) of this memory is placedin sc->sc_desc
>   by bus_dma_alloc. The callback from bus_dma_load puts a pointer to
>   the pysical address of the memory into sc->sc_desc_paddr.
> * There is an area of memory, provided by malloc(), pointed to by
>   sc->sc_bufptr.
> * The following then takes place:
>
> 	ds = sc->sc_desc;
> 	bf = sc->sc_bufptr;
> 	[...]
>         TAILQ_INIT(&sc->sc_rxbuf);
>         for (i = 0; i < ATH_RXBUF; i++, bf++, ds++) {
>                 bf->bf_desc = ds;
>                 bf->bf_daddr = sc->sc_desc_paddr +
>                     ((caddr_t)ds - (caddr_t)sc->sc_desc);
>                 error = bus_dmamap_create(sc->sc_dmat, BUS_DMA_NOWAIT,
>                                           &bf->bf_dmamap);
>                 if (error != 0)
>                         break;
>                 TAILQ_INSERT_TAIL(&sc->sc_rxbuf, bf, bf_list);
>         }
>
>
>   Three questions at this point:
>
>   - How does the kernel know what size to alloc for the DMA memory
>     at sc->sc_desc? FreeBSD's bus_dmamap_alloc does not take a size,
>     wheras NetBSD's does. What size should I use?

The size comes from the tag.  This is why there's a separate tag created 
each time a new allocation size is needed.

>   - How do I get the equivalent of a "KVA mapping" using NetBSD? I would
>     need that in order to make a correspondence between descriptors and
>     buffers.

I don't recall what the equivalent NetBSD call is.

>   - Why does this code create a separate DMA map for each buffer? Surely
>     they are all the same (i.e. each call to dmamap_create returns an
>     identical data structure)? But if so, why not just bcopy them?
>

You need a separate map for each chunk of memory you map.  On x86 this is a 
noop unless you're using bounce buffers.  Again I don't recall the 
equivalent netbsd specifics.

I suggest you pick a netbsd driver and understand it.  Then apply those 
techniques to the ath driver.  There is nothing especially difficult about 
dealing with dma for the Atheros h/w; it's just a bunch of memory areas 
that you map for DMA.  No special coherency requirements.  The descriptors 
may require a little extra work to insure coherent access.

> That is the immediate problem I'm struggling with. However, there are
> a couple of showstoppers that are becoming apparent:
>
> * We have no taskqueue API, or anything remotely like it.
> * We don't have mutexes, although we have simple locks. The taskqueue
>   API in FreeBSD requires mutexes, although a cursory glance suggests
>   that it would be possible to substitute simple locks for mutexes
>   (in particular, it uses no recursive mutexes).
>
> Changing if_ath.c to avoid using these features would amount to rewriting
> most of the driver, which is beyond my spare time if not ability. Although
> I might be able to hack FreeBSD's taskqueue_subr.c to use locks, and get
> it to compile, (a) doing so would make it very difficult to debug the
> driver,  because many, many more things could go wrong and (b) adding an
> important, general API is something that should be done _right_.
>
> I'm willing to keep plugging away at this, but I'm not going to spend
> hours fooling around with bus_dma stuff, only for it all to be in vain
> due to a lack of taskqueues.

I think you want to do away with taskqueues under netbsd.  They are used to 
defer interrupt processing from the ISR to another context.  I believe the 
netbsd model is to do this processing directly in the interrupt context so 
just call the taskqueue processing routine directly from ath_intr and fixup 
the logic to conform.

Likewise ignore mutex's and think spl's instead.  For this you need to 
think of protecting against threads of execution instead of synchronizing 
access to data structures.  For example, whenever you're operating in the 
top-half of the kernel (e.g. in the start routine) and you want to insure 
noone comes along and modifies your data structures you raise the IPL 
appropriately.  Under FreeBSD this sort of synchronization is done with 
mutex's on data structures because you can potentially enter the driver 
other than through interrupts (e.g. via kernel preemption or another 
processor).

In the end you're likely to have a very different driver.  But when it 
works I can steal it back for use under FreeBSD -stable! :)

	Sam