Subject: Re: [RFC] Interface to hardware-assisted data movers
To: None <cgd@broadcom.com>
From: Jason R Thorpe <thorpej@wasabisystems.com>
List: tech-kern
Date: 07/10/2002 08:26:46
On Fri, Jun 21, 2002 at 05:49:45PM -0700, cgd@broadcom.com wrote:

 > * suggest possibly splitting dmover front-end interface and back-end
 >   interface manual pages and maybe changing function names.  e.g.:
 > 
 > 	dm_*
 > 	dmbackend_* or something.
 > 
 >   Users of the interface really don't care about half of this
 >   document.  8-)

Yah, easy enough.  I'll make a sweep through that today.

 > * I'm a bit concerned about the separation of this from the
 >   'transform' API.
 > 
 >   A single piece of HW may be able to provide both types of interfaces
 >   ([dm] generic data movement, zero, prefetch-into-l2, copy; [xform]
 >   generic up-to-32bit CRC generation, IP checksum).
 > 
 >   It's therefore important to be able to have the two seamlessly
 >   integrated into a single back-end driver, and I dunno how well that
 >   will be possible with this API.
 > 
 >   Also, note that one of the impacts of that is that, say, if the
 >   device has a certain sustainable throughput/bandwidth, both types of
 >   uses will contribute to reaching that limit.

If a hardware device can provide both kinds of function, then it can
present itself as both a dmover and an xform backend, and arbitrate
between the two queues internally.

Regarding load ... yah, that's an issue.  More on that below...

 > * the load balancing algorithm, etc., seems a bit ad-hoc.
 >   additionally, static assignments of sessions to back-ends for all
 >   times also seems limiting.  why restrict by describing it that way?
 > 
 >   random thought that popped into my head: if you have some kind of HW
 >   assist module which gets removed from the system (!!), in current
 >   scheme all dmover clients who happened to have their sessions
 >   assigned to that module will need to squish and create sessions
 >   anew.
 > 
 >   requirement that hw be used first is kinda lame...  what if your xor
 >   engine is maxed out but you've got a dual-processor system that's
 >   idle waiting on xors to finish?

So, the issue here is that you want to avoid having to look up which
device can handle your request each time you want to issue one.

I suppose a per-session list of backends that can perform the request
could be built when the session is created.  Most of the time, this list
will be short (one or two elements), so selecting from it would be fast
(no need to compare strings, etc.).
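
A rough sketch of the per-session list idea, under stated assumptions:
the names (struct backend, struct session, session_init, session_pick,
MAX_CANDIDATES) and the "least pending requests" policy are made up for
illustration and are not part of the proposed dmover API.  The string
comparisons happen once, at session creation; each request then only
walks the short cached list.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CANDIDATES 4	/* hypothetical cap; lists should be short */

/* Hypothetical back-end descriptor; not the real dmover structure. */
struct backend {
	const char *function_name;	/* function this back-end provides */
	unsigned pending;		/* requests currently queued on it */
};

/* Hypothetical session: caches the back-ends that matched at creation. */
struct session {
	struct backend *candidates[MAX_CANDIDATES];
	size_t ncandidates;
};

/* Do the string comparisons once, when the session is created. */
static size_t
session_init(struct session *s, const char *name,
    struct backend *all, size_t nall)
{
	size_t i;

	s->ncandidates = 0;
	for (i = 0; i < nall && s->ncandidates < MAX_CANDIDATES; i++) {
		if (strcmp(all[i].function_name, name) == 0)
			s->candidates[s->ncandidates++] = &all[i];
	}
	return s->ncandidates;
}

/* Per-request selection: no strings, just pick the least-loaded entry. */
static struct backend *
session_pick(struct session *s)
{
	struct backend *best = NULL;
	size_t i;

	for (i = 0; i < s->ncandidates; i++) {
		if (best == NULL || s->candidates[i]->pending < best->pending)
			best = s->candidates[i];
	}
	return best;
}
```

A back-end that goes away would only need to be dropped from these
short lists, rather than forcing its sessions to be torn down.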

This gets problematic with xform, unfortunately.  Many devices require
a "constructor" type operation to be performed when a session is created
(allocate some data structure, etc.), and then there's the issue of the
"setkey" operation, as well.  I suppose the hardware-specific session
creation could be done lazily, however...
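
One way lazy creation might look, purely as a sketch: every name here
(struct xbackend, struct xslot, struct xsession, xsession_hw_state,
demo_ctor) is hypothetical, and it assumes the session keeps the key
around so that a constructor run late can still see it.

```c
#include <stddef.h>

struct xbackend;

/* One slot per candidate back-end in the session (hypothetical). */
struct xslot {
	struct xbackend *be;
	void *hw_state;		/* NULL until this back-end is first used */
};

/* Hypothetical back-end with a per-session "constructor" hook. */
struct xbackend {
	void *(*ctor)(struct xbackend *, const unsigned char *key,
	    size_t keylen);
};

struct xsession {
	const unsigned char *key;	/* retained for late constructors */
	size_t keylen;
	struct xslot slots[2];
	size_t nslots;
};

/* Demo constructor: counts its calls and hands back some static state. */
static unsigned demo_ctor_calls;

static void *
demo_ctor(struct xbackend *be, const unsigned char *key, size_t keylen)
{
	static int state;

	(void)be; (void)key; (void)keylen;
	demo_ctor_calls++;
	return &state;
}

/* Return the hardware state for slot idx, constructing it on first use. */
static void *
xsession_hw_state(struct xsession *s, size_t idx)
{
	struct xslot *slot;

	if (idx >= s->nslots)
		return NULL;
	slot = &s->slots[idx];
	if (slot->hw_state == NULL)
		slot->hw_state = slot->be->ctor(slot->be, s->key, s->keylen);
	return slot->hw_state;
}
```

The cost is that the first request sent to each back-end pays for the
constructor, and that a constructor failure shows up at request time
rather than at session creation.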

Perhaps a good interim step is to have the spec allow for load balancing,
but still bind each session to a single piece of hardware in the first
version of the code.

Regarding "always choose hw assist" ... The reason that hardware is
chosen over software is that hardware is asynchronous.  In just about
every application I've encountered, even if the throughput of the
device is lower than that of the CPU, it's better to let the CPU work on
processing more requests, rather than doing the dmover or xform
operations itself.

If you have a spare CPU to throw at the problem, then one solution is
to bind that CPU to a chunk of code that does dmover/xform stuff, and
treat it like a device.

 > * dmover_request struct:
 > 
 > 	* dreq_imm{8,16,64} don't seem to be used by any existing
 > 	method.

They're there for future expandability.

 > 	* for the 'fill' methods, is there any reason you don't just
 > 	use & require a simple linear buffer to get the data?  (i.e.,
 > 	supply a ptr to the value rather than just the value in the
 > 	struct?)  That would simplify the interface a bit, with
 > 	approximately no cost.  dreq_imm32 would then go away.
 > 	(annoying thing about this is, then you need persistent
 > 	storage for asynchronous use of the data...)

Yah, I wanted to avoid having to provide persistent storage for
immediate values.

 > 	* hmm, maybe make ptr to inbuf?  would be nice if
 >         dmover_request structs were fixed sized (could be poolified),
 >         plus variable-sized structs are always fun to allocate.
 > 
 > 	* the above should be unioned together, if anything more than
 > 	a ptr to a dreq_inbuf is left.
 > 
 > 	* should include a count of dreq inbuf array elements, to
 >         allow back-end to sanity check.

Yah, this is all good.  I'll take a look at this.
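
For concreteness, a fixed-size request along the lines suggested above
might look something like the following.  This is only a sketch: the
sketch_* names and the field layout are assumptions, not the actual
dmover_request definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical input-buffer descriptor; the real type may differ. */
struct sketch_inbuf {
	const void *addr;
	size_t len;
};

/*
 * Hypothetical fixed-size request.  The immediates share storage in a
 * union, and the inputs are reached through a pointer plus a count
 * instead of a trailing variable-length array, so every request is the
 * same size and could come from a pool.
 */
struct sketch_request {
	union {
		uint8_t  imm8;
		uint16_t imm16;
		uint32_t imm32;
		uint64_t imm64;
	} imm;
	const struct sketch_inbuf *inbuf;	/* caller-owned array */
	unsigned ninbuf;			/* elements in inbuf[] */
};

/* Back-end sanity check: 1 if the request has the expected inputs. */
static int
sketch_check_inputs(const struct sketch_request *req, unsigned expected)
{
	if (req->ninbuf != expected)
		return 0;
	if (expected != 0 && req->inbuf == NULL)
		return 0;
	return 1;
}
```

Note that the immediate still lives inside the request itself, so the
caller does not have to provide persistent storage for it.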

 > * use of hard-coded strings is nice for extensibility, but error prone
 >   and a bit wasteful.  Suggest providing #defines for "globally
 >   recognized" methods.  OK, they can expand into strings or whatever,
 >   but then at least you get compile-time checking of correctness and
 >   maybe some space savings.  e.g.:
 > 
 > 	/* declare & init to "zero-block" in some file... */
 > 	extern const char dmover_function_zero_name[];
 > 	#define DMOVER_FUNCTION_ZERO	dmover_function_zero_name
 > 
 >   People don't have to use the #defines, but if they do then they know
 >   they probably got the function they wanted if it compiles.  8-)

Yah, I'll look at this, as well.
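
A small self-contained sketch of how that could fit together.  The
dmover_function_zero_name / DMOVER_FUNCTION_ZERO names and the
"zero-block" string come from the suggestion above; the sketch_known
table and sketch_lookup() are hypothetical stand-ins for however a
backend would actually match names.

```c
#include <stddef.h>
#include <string.h>

/* Normally defined in one .c file and declared extern in a header. */
const char dmover_function_zero_name[] = "zero-block";
#define DMOVER_FUNCTION_ZERO	dmover_function_zero_name

/* Hypothetical table of the functions some back-end knows about. */
static const char *const sketch_known[] = {
	dmover_function_zero_name,
};

/* Return the table index of name, or -1 if it is not known. */
static int
sketch_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_known) / sizeof(sketch_known[0]); i++) {
		if (strcmp(sketch_known[i], name) == 0)
			return (int)i;
	}
	return -1;
}
```

A misspelled DMOVER_FUNCTION_ZERO fails at compile time, whereas a
misspelled string literal such as "zero-blokc" compiles fine and only
fails when the lookup returns -1 at run time.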

 > * the 'hwassist' specifications are ... lame.  Make the back-ends
 >   estimate their throughput in terms of bytes/sec (or, if you want to
 >   be modern, MB/sec).
 > 
 >   Provide a way for sessions to estimate throughput requirements.
 > 
 >   Try to do a bit more sane assignment based on throughput/bandwidth
 >   than it sounds like you're planning or can do w/ current interface.

The limitations are currently in the backend -> session interface, which
isn't really exposed to users of dmover/xform.  That should make them a
little easier to fix in the future.

Anyway, I'll address the rest of this in a future version of the document.

-- 
        -- Jason R. Thorpe <thorpej@wasabisystems.com>