Re: Entropy problem [was Re: CVS Problem (again) ,Slightly lesser old code, but old still [was Re: Console problem ,older code]]

To: Germain Le Chapelain <german%lanvaux.fr@localhost>
Subject: Re: Entropy problem [was Re: CVS Problem (again) ,Slightly lesser old code, but old still [was Re: Console problem ,older code]]
From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
Date: Sat, 5 Dec 2020 23:43:29 +0000

> Date: Sat, 5 Dec 2020 15:19:15 -0800
> From: Germain Le Chapelain <german%lanvaux.fr@localhost>
> 
> It's building firefox again,
> 
> I haven't yet tried any work-around mentioned in the email
> introducing the change, but I am confused about the issue, why this
> arises.
> 
> I thought the way randomness was implemented is that you would
> initialize one seed-number to an amd64 and it would pass you back
> random #s, anytime and at any rate, *guaranteed* to be random,
> provided that for sure your initial # was random...

This is roughly right -- once you have a good seed stored in
/var/db/entropy-file, it serves to generate any amount of random
output, and it generally will be kept updated from boot to boot.

The issue is that certain software tries to avoid a security
catastrophe where you _haven't yet gotten a good enough seed_ before
you generate cryptographic keys that matter, like in
<https://factorable.net/>.

If your CPU has RDRAND/RDSEED, we assume it works (unless it gives
consecutive repeated 256-bit outputs, in which case we decide it's
broken), and nothing should ever block (unless you explicitly turn
that on for debugging purposes).  (If your CPU has RDRAND/RDSEED but
it's still blocking, let me know and we can try to diagnose that --
you can check with `cpuctl identify 0'.)
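
If you would rather script that check, here is a minimal sketch in
Python.  It assumes only that the feature names RDRAND and RDSEED
appear as plain words somewhere in the `cpuctl identify 0' output; the
exact output format varies, so the helper names (rdrand_features,
cpu_identify) and the sample line in the comment are hypothetical.

```python
import re
import subprocess


def rdrand_features(identify_text):
    """Report whether RDRAND / RDSEED appear as whole words in the text."""
    found = set(re.findall(r"\b(RDRAND|RDSEED)\b", identify_text.upper()))
    return {name: name in found for name in ("RDRAND", "RDSEED")}


def cpu_identify(cpu=0):
    """Run `cpuctl identify <cpu>` and return everything it printed.

    cpuctl(8) may need root; stdout and stderr are merged because this
    sketch does not assume which stream carries the feature list.
    """
    result = subprocess.run(
        ["cpuctl", "identify", str(cpu)],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=True,
    )
    return result.stdout


# Usage on a NetBSD machine:
#     print(rdrand_features(cpu_identify(0)))
# Usage on captured text (hypothetical sample line):
#     rdrand_features("cpu0: features1 0x1<SSE3,RDRAND>")
```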

But if you don't have RDRAND/RDSEED, and you don't have a seed stored
on disk, and you don't have any other hardware RNG, and a program
running on NetBSD asks to _wait until it is seeded_ -- then, well, it
will hang.
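
Spelled out as a toy model (this is just the paragraph above restated
as a predicate, not how the kernel decides anything; the function and
parameter names are made up for illustration):

```python
def may_block(has_working_rdrand, has_seed_file, has_other_hwrng,
              waits_for_seed):
    """Toy model of the hang condition described in this message.

    The system counts as seeded if any one source is available; only a
    program that explicitly asks to wait for seeding can hang.
    """
    seeded = has_working_rdrand or has_seed_file or has_other_hwrng
    return waits_for_seed and not seeded


# No RDRAND, no stored seed, no other hardware RNG, and a caller that
# waits for seeding: this is the only combination that hangs.
# may_block(False, False, False, True)
```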

In NetBSD<=9 (and in Linux and other systems), the kernel would stop
hanging after a while on the basis of a metric so meaningless it is
tantamount to a lie; NetBSD-current no longer incorporates that lie.
That said, in many contexts hanging is confusing and not helpful, and
I would like to address it better before we get to NetBSD 10 (e.g.,
<https://gnats.netbsd.org/55659> for some ideas).  But that's what it
does right now.

I'm not sure exactly what symptom you're experiencing at the moment,
but it might be compounded by an issue with Python, which is the
following:

1. Many years ago, in the early 2000s, Python added a function
   os.urandom(n) which returns a string of n bytes chosen uniformly at
   random _without ever blocking_ -- in other words, the contract was:
   if the system doesn't have a good enough seed, too bad; the caller
   of os.urandom(n) accepts responsibility for the security
   consequences (for example, the caller is leaving it to a system
   engineer to put the parts together so it's not an issue).

2. A lot of code was written using Python os.urandom(n), including the
   Python dict hash table randomization, the Python random module, and
   the Python multiprocessing module.

3. In 2016, Guido van Rossum deliberately changed the semantics of
   os.urandom(n), starting with Python 3.6, by accepting PEP 524:

   https://www.python.org/dev/peps/pep-0524/
   https://mail.python.org/pipermail/security-sig/2016-August/000101.html

   Changing os.urandom(n) so it _does_ block, after over a decade of
   an API contract where it _never_ blocks, had the side effect that
   Python programs would almost all hang early at boot even if they
   don't appear to do any cryptography themselves.

4. To mitigate this symptom, Python was tweaked a tiny bit inside so
   that the random module and the dict hash table randomization would
   be initialized without blocking -- but _nothing else in Python_ has
   access to this path (without specifically opening /dev/urandom or
   similar).  That includes the multiprocessing module, which is used
   by, e.g., the build system meson.
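
To make "specifically opening /dev/urandom" in item 4 concrete, here
is a minimal sketch of what a Python program has to do by hand to get
the old never-block behaviour.  The helper name urandom_nonblocking is
hypothetical, and the caveat from item 1 applies in full: it does not
wait for the system to be seeded, so the caller accepts responsibility
for the security consequences.

```python
def urandom_nonblocking(n, path="/dev/urandom"):
    """Read exactly n bytes straight from the device node.

    This bypasses os.urandom(), which since PEP 524 may wait until the
    system is seeded.  Do not use it for keys that matter unless
    something else guarantees the system already has a good seed.
    """
    chunks = []
    remaining = n
    with open(path, "rb", buffering=0) as dev:
        while remaining > 0:
            chunk = dev.read(remaining)  # unbuffered reads may be short
            if not chunk:
                raise OSError("unexpected end of file on " + path)
            chunks.append(chunk)
            remaining -= len(chunk)
    return b"".join(chunks)


# Usage: 16 bytes for something non-critical, e.g. a temporary name.
# token = urandom_nonblocking(16)
```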

Now we have a kind of nasty dilemma with Python:

- We could leave it as is, and Python will just block on some systems
  even if it's doing builds and not talking over the internet or
  anything.

- We could patch os.urandom(n) so that it returns to the old
  never-block semantics, and (a) risk factorable.net-type
  vulnerabilities in Python programs that assume os.urandom(n) _will_
  block until seeded, and (b) risk a bad reputation for breaking
  Python security like Debian did with OpenSSL.

Neither of these options is appealing -- I'm still working on finding
a way to finesse it that will work out better for everyone in the end.
