NetBSD-Bugs archive


Re: kern/54897 (ipsec tests now fail randomly on real hardware)



Synopsis: ipsec tests now fail randomly on real hardware

State-Changed-From-To: open->closed
State-Changed-By: riastradh%NetBSD.org@localhost
State-Changed-When: Thu, 15 Aug 2024 17:58:40 +0000
State-Changed-Why:
Here's a theory about what happened, in netbsd-9 and 9.99.x before the
entropy rework:

1. rnd_init
2. rump_hyperentropy_init -> rnd_attach_source
3. cprng_init
4. kern_cprng = cprng_strong_create
   (a) draws from entropy pool
   (b) requests samples from sources (rndsource callback)
5. cprng_fast_init
   -> draws from kern_cprng
6. rnd_init_softint
   -> enters samples from sources into pool
7. ifconfig shmifN create
   -> cprng_fast32
      (a) draws from cprng_fast
      (b) schedules softint to reseed cprng_fast from cprng_strong

The first call to gather a hyperentropy sample is at 4(b).  But even
though netbsd-6 through netbsd-9 have a fancy notification system
(rndsinks) to actively trigger reseeding, cprng_fast doesn't use it, so
the hyperentropy sample doesn't get incorporated into cprng_fast until
the reseed scheduled at 7(b) runs.  So the only samples that affect
shmif at 7(a) are weak timing samples and similar.  Hence the high
collision probability.
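
To put rough numbers on that collision risk, here is a minimal sketch
of the birthday bound in C.  It assumes, purely for illustration, that
each instance's value is uniform over 2^bits possibilities.  The name
collision_probability and the bit counts used below are hypothetical;
nothing above says how many bits the weak timing samples are actually
worth.

```c
/*
 * Probability that at least two of n independent draws collide when
 * each draw is uniform over 2^bits values (exact birthday product).
 * Illustrative only: "bits" stands for the effective entropy behind
 * each draw, which is an assumption, not a measured quantity.
 */
double
collision_probability(unsigned n, unsigned bits)
{
	double space = 1.0;		/* becomes 2^bits */
	double no_collision = 1.0;	/* P(all draws so far distinct) */

	for (unsigned b = 0; b < bits; b++)
		space *= 2.0;
	for (unsigned i = 1; i < n; i++) {
		if ((double)i >= space)
			return 1.0;	/* more draws than values */
		no_collision *= 1.0 - (double)i / space;
	}
	return 1.0 - no_collision;
}
```

For example, collision_probability(4, 8) is about 0.023, while
collision_probability(4, 32) is on the order of 1e-9, so a few missing
bits of seed entropy make a large difference across many test runs.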

Here's how it works differently in netbsd-10:

1. rnd_init
2. rump_hyperentropy_init -> rnd_attach_source
   -> request samples from sources (rndsource callback)
   -> enters samples from sources into pool
3. cprng_init
   -> kern_cprng = cprng_strong_create
   -> draws from entropy pool
4. (no separate step 4; kern_cprng is created inside cprng_init)
5. cprng_fast_init (doesn't draw anything)
6. rnd_init_softint
7. ifconfig shmifN create
   -> cprng_fast32
      (a) seeds cprng_fast with draw from cprng_strong
      (b) draws from cprng_fast

Note that in netbsd-10, except for samples entered in hard interrupt
context, rndsource samples are synchronously added to the pool -- and
that includes the samples entered by hyperentropy in the rndsource
callback.  So the first call to cprng_fast at 7(a) is always seeded
with hyperentropy samples.
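
The difference between the two orderings can be sketched with a toy
model in C.  This is not the NetBSD code: toy_pool, toy_mix, and the
two toy_first_output_* functions are hypothetical stand-ins, the mixer
is just an invertible 64-bit scrambler, and the asynchronous reseed at
7(b) in the old ordering is left out because it happens after the first
output.

```c
#include <stdint.h>

/* Toy entropy pool: a single 64-bit word, for illustration only. */
struct toy_pool { uint64_t state; };

/* Invertible 64-bit mixer (splitmix64 finalizer); not cryptographic. */
static uint64_t
toy_mix(uint64_t x)
{
	x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27; x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return x;
}

static void
toy_pool_enter(struct toy_pool *p, uint64_t sample)
{
	p->state = toy_mix(p->state ^ sample);
}

static uint64_t
toy_pool_draw(struct toy_pool *p)
{
	p->state += 0x9e3779b97f4a7c15ULL;
	return toy_mix(p->state);
}

/* Old ordering: fast generator seeded before the good sample arrives. */
uint64_t
toy_first_output_eager(uint64_t weak, uint64_t hyper)
{
	struct toy_pool pool = { 0 };
	uint64_t seed;

	toy_pool_enter(&pool, weak);	/* only weak samples so far */
	seed = toy_pool_draw(&pool);	/* seed drawn at init time */
	toy_pool_enter(&pool, hyper);	/* good sample enters too late */
	return toy_mix(seed);		/* first output ignores hyper */
}

/* New ordering: sample entered synchronously, seed drawn on first use. */
uint64_t
toy_first_output_lazy(uint64_t weak, uint64_t hyper)
{
	struct toy_pool pool = { 0 };
	uint64_t seed;

	toy_pool_enter(&pool, weak);
	toy_pool_enter(&pool, hyper);	/* good sample already in pool */
	seed = toy_pool_draw(&pool);	/* seed drawn at first use */
	return toy_mix(seed);
}
```

With the same weak sample and two different hyperentropy samples, the
eager variant returns the same first output both times, while the lazy
variant returns different outputs, because every step in it is
invertible in the pool state.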

So I think the underlying cause of this bug has been resolved in
netbsd-10 and current, but I don't think anyone has the appetite to
pull it up to netbsd-9.  Maybe there is a simpler change to netbsd-9
that would be worth applying, but the dependencies between all the
seeding components are very tricky and often lead to hard-to-diagnose
bugs early at boot when tweaked or reordered.

In any case, I think it is better to use cprng_strong here.  The names
cprng_strong and cprng_fast are not apt, and if I were choosing them
today, I would choose `cprng' and `weakcprng', the idea being that you
only use weakcprng if there is a performance constraint that overrides
security for some reason and you can tolerate disclosure of past
outputs.  For use in device attach or creation, there is no such
performance constraint, so cprng_strong is the right choice.  So if
anyone wants pullups, just changing to cprng_strong is a better fix.




