Subject: Re: [RFC] Interface to hardware-assisted data movers
To: Jason R Thorpe <thorpej@wasabisystems.com>
From: None <cgd@broadcom.com>
List: tech-kern
Date: 07/18/2002 15:33:30
At Thu, 18 Jul 2002 15:22:29 -0700, Jason R Thorpe wrote:
>  > * DMOVER_REQ_WAIT: You probably want an additional flag which forces
>  >   the processing code to spin, rather than sleeping.  Horrible, yes,
>  >   but desirable if you're using it in a place where you can't sleep
>  >   but do know that the data mover will be substantially faster or
>  >   provide other benefits as compared to a CPU-based copy.  (an example
>  >   of this might be a pmap zero page routine.)
> 
> The restriction, of course, is that that can never be called from an
> interrupt handler.  Any thoughts on a name?  Where should the spinwait
> be done?  In the back-end or the midlayer?

<thinking out loud...>

Sure it would be generally _undesirable_ to invoke a spin-wait move
from an interrupt handler...  but i don't know that it's the right
thing to _force_ that restriction...  (I don't know that that
restriction is necessary for correct implementation.)

So, I don't think i've looked at the implementation, but given that
e.g. the completion fns are called at soft interrupt level, I've been
assuming that the code to handle that soft interrupt generation was in
the midlayer.  Similarly for the code that sleeps when DMOVER_REQ_WAIT
is specified.  If that's the case, I'd expect that the spin loop code
would go in the same place.  Hmm, but then, right, you wouldn't be able
to call it from interrupt context, since the dmover's completion
interrupt might be blocked, unless there were a 'poll for completion'
mechanism...

hmm, maybe just punt and say "don't call from interrupt context."  8-)
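
To make the 'poll for completion' idea concrete, here is a rough sketch
of a midlayer spin loop that asks the back-end to check the hardware
instead of waiting for its interrupt.  Every name here (sketch_req,
sketch_backend_poll, sketch_spin_wait) is made up for illustration and
is not from the actual dmover code; the "hardware" is simulated with a
countdown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request: only the fields this sketch needs. */
struct sketch_req {
	volatile bool done;	/* set by the back-end on completion */
	int polls_left;		/* simulation: polls until the "hardware" finishes */
};

/*
 * Hypothetical back-end hook: check the hardware for completion without
 * relying on its interrupt being delivered.
 */
static void
sketch_backend_poll(struct sketch_req *req)
{
	if (req->polls_left > 0 && --req->polls_left == 0)
		req->done = true;
}

/*
 * Midlayer spin-wait.  Returns the number of polls it took for the
 * request to complete, or -1 if it gave up after max_polls.
 */
static int
sketch_spin_wait(struct sketch_req *req, int max_polls)
{
	int n;

	assert(req != NULL);
	for (n = 0; n < max_polls; n++) {
		if (req->done)
			return n;
		sketch_backend_poll(req);
	}
	return req->done ? n : -1;
}
```

With a hook like that the loop doesn't depend on the completion
interrupt, so a blocked interrupt would not by itself make it spin
forever.  Whether that is enough to make it safe from interrupt context
is a separate question.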



>  > * so, all these immediate value things...  should probably be a union,
>  >   or even just a single 64-bit type w/ e.g. fill8 specified to use
>  >   only the low byte value.  maybe if you want to do something even
>  >   better (i heard you talking about byte order problems), maybe an
>  >   array of bytes, and specify how they're copied (and that fill8 uses
>  >   the first one, fill16 uses the first 2, etc.)
> 
> In the code, they are all union'd together.  I've been considering making
> it:
> 
> 	uint8_t dreq_imm[8];
> 
> to eliminate any confusion about byte ordering.  Thoughts?

yes, i thought i just suggested something like that.  8-)
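
For illustration, a sketch of how a byte-array immediate could be
consumed, with fill8 using the first byte, fill16 the first two, and so
on.  Only the dreq_imm name comes from this thread; the struct and
sketch_fill() are assumptions, and a real back-end would hand the bytes
to the hardware rather than loop on the CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request holding only the immediate value. */
struct sketch_req {
	uint8_t dreq_imm[8];	/* bytes are used in array order */
};

/*
 * Fill dst with a repeating pattern made of the first `width` bytes of
 * the immediate: width 1 behaves like fill8, width 2 like fill16, etc.
 * Because the pattern is defined byte by byte, host byte order never
 * enters into it.
 */
static void
sketch_fill(uint8_t *dst, size_t len, const struct sketch_req *req,
    size_t width)
{
	size_t i;

	assert(width >= 1 && width <= sizeof(req->dreq_imm));
	for (i = 0; i < len; i++)
		dst[i] = req->dreq_imm[i % width];
}
```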


>  > * Is it the only interface by which one can get request structures, or
>  >   may they be allocated by other means?  if the latter is OK, specify
>  >   that they must be zeroed so that we can add things in the future w/o
>  >   too much pain.  (e.g., priority, but i don't know if that should go
>  >   on session or request.)
> 
> Well, nothing frees the request structures implicitly, so certainly clients
> could provide their own.  However, I'm trying to keep ABI considerations
> in mind.  I suppose I could note the ABI consideration in the document, and
> explicitly allow callers to provide their own request structures if they
> want to do so.

So, hmm.

I wonder if having a mechanism to allocate an array of the structures
would in fact then be best?

I mean, i could easily see a driver wanting to allocate a "set" of
request structures on attach, then using them, and that might be
better/easier for it than allocating a bunch individually.

Same goes with the input buffers...

Some part of me says, just make callers use these interfaces and make
the interfaces always handle allocation of the appropriate buffer
arrays, etc.
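
Roughly what I have in mind for the array case, as a userland sketch.
The names (sketch_req, sketch_req_alloc_array, sketch_req_free_array)
and the fields are hypothetical, and calloc() stands in for whatever
kernel allocator would really be used.  The point is that the interface
owns the allocation and hands back zeroed structures, so fields added
later start out at a known default.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical request structure; the real one has more in it. */
struct sketch_req {
	int flags;
	int priority;	/* e.g. a field added later; 0 means "default" */
};

/*
 * Allocate `count` zeroed request structures in one go, e.g. at driver
 * attach time.  Returns NULL on failure.
 */
static struct sketch_req *
sketch_req_alloc_array(size_t count)
{
	assert(count > 0);
	return calloc(count, sizeof(struct sketch_req));
}

/* Release an array obtained from sketch_req_alloc_array(). */
static void
sketch_req_free_array(struct sketch_req *reqs)
{
	free(reqs);
}
```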

>  > * what happens if you destroy a session while requests are active?
>  >   What happens if you free a request while it is active?
> 
> Right now, it panics.  I haven't addressed that yet.

Well, I think panicking may be a reasonable thing to do, but if so
those cases should be documented as being illegal.  8-)
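
If it helps, the documented rule could be backed by a check along these
lines.  This is only a sketch: the session structure, the counter, and
all the function names are invented here, and assert() stands in for
the kernel's panic().

```c
#include <assert.h>

/* Hypothetical session that counts its in-flight requests. */
struct sketch_session {
	int nactive;
};

/* Called when a request is handed to the data mover. */
static void
sketch_submit(struct sketch_session *s)
{
	s->nactive++;
}

/* Called when a request completes. */
static void
sketch_complete(struct sketch_session *s)
{
	assert(s->nactive > 0);
	s->nactive--;
}

/* True if no requests are outstanding on the session. */
static int
sketch_session_idle(const struct sketch_session *s)
{
	return s->nactive == 0;
}

/*
 * Destroying a session with active requests is documented as illegal,
 * so the check fails loudly rather than trying to recover.
 */
static void
sketch_session_destroy(struct sketch_session *s)
{
	assert(sketch_session_idle(s));	/* stands in for panic() */
}
```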



chris