tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Patch: cprng_fast performance - please review.



Miscellaneous notes -- I'm doing the benchmarking we discussed and
pondering whether it is right to switch from hc-128 to salsa20, separately.

On Thu, Apr 17, 2014 at 09:33:28PM +0000, Taylor R Campbell wrote:
> 
> > +void
> > +hc128_init(hc128_state_t *state, const uint8_t *key, const uint8_t *iv)
> > +{
> > +   unsigned int i;
> > +   uint32_t w[1280], *p = state->p, *q = state->q;
> 
> 5 KB on the stack is a lot!  Granted, this is a leaf routine which in
> our case will be called only in a softint handler, but still.

Fixed.

> For the purposes of a PRNG, conversion to little-endian is not
> necessary.  Would be nice for hc128_extract to just return uint32_t,
> too.

I'll benchmark this on a hardware, BE system (somehow; I don't have one
that's easy to get running again).  Whatever cipher we use for this I
would prefer to leave its core transform alone, for several reasons.

> Forward declaration of cprng_fast_randrekey should go among the other
> forward declarations, not mixed up among the global state.

Fixed.

> Can we put the creation of kern_cprng and the initialization of
> cprng_fast into cprng_init?  If not, there should be a comment
> explaining why not.

No -- it needs the kernel_cprng ready.  I added a comment.

> Why generate the IV randomly?  Why not a counter?  Does the IV have
> any requirements other than that it never be reused with the same key,
> i.e. is it anything other than a nonce?

Why not generate it randomly?  We hardly ever do it and extraction from
the kernel_cprng is cheap compared to the other keying operations.

> Are there actually any callers of cprng_fast at IPL_HIGH?  Are there
> actually any legitimate random decisions to be made at IPL_HIGH?  I'm
> sceptical.

What do you get if you cross an elephant and a rhinocerous?  Given that
what we do within the spl* takes very little time I am inclined to say
what spl we go to hardly matters, and be very conservative.  The real
question here, I think, is whether we should spl*() at all, or forbid
use of cprng_fast() from interrupt context entirely.

> > +size_t
> > +_cprng_fast_exact(void *p, size_t len)
> > +{
> 
> This should KASSERT(len <= CPRNG_MAX_LEN) or something so nobody is
> tempted to generate obscene amounts of data at once (at IPL_HIGH or
> IPL_VM or whatever it turns out to be).

Fixed.

> > +   ctx->numbytes += len;
> 
> Check for arithmetic overflow and saturate.

Can't get even close -- the max bytes on key is constrained to 2^29
on entry to the function.

I fixed _cprng_fast_inexact per your comments.

Thor


Home | Main Index | Thread Index | Old Index