Re: /dev/random is hot garbage

To: Martin Husemann <martin%duskware.de@localhost>
Subject: Re: /dev/random is hot garbage
From: Taylor R Campbell <campbell+netbsd-tech-kern%mumble.net@localhost>
Date: Sun, 21 Jul 2019 16:53:08 +0000
> Date: Sun, 21 Jul 2019 17:28:17 +0200
> From: Martin Husemann <martin%duskware.de@localhost>
> 
> Replacing the /dev/random device node by a symlink to /dev/urandom sounds
> fine. For binaries it is easy to just use the sysctl instead to get high
> quality randomness. Are there any shell script like applications that
> seriously would require something better than /dev/urandom?
> 
> The other issue is the urban rumour that you may want to pull a real random
> byte out of /dev/random before using /dev/urandom - maybe we should have
> a "aggregate" sysctl doing just that (so applications can get a single byte
> real entropy + as many /dev/urandom ones as they like in a single call)?

Reading a byte from /dev/random before using /dev/urandom is the
correct way -- one that works pretty reliably on almost any platform
that has /dev/u?random at all -- for a program to block until the
entropy pool has been seeded; there's essentially no other reason ever
to read from /dev/random.

What may not be clear is _which_ programs need to do this or when, and
the farther from the holistic view of system engineering you are, the
murkier it gets.

 * The system view.  Someone who is assembling a platform with
   pre-installed NetBSD to be shipped in a box and deployed needs to
   ensure, in the system they're shipping, that the entropy pool be
   seeded by an unpredictable secret _before_ any secrets derived from
   it are used, e.g. for cryptography.

   - A system engineer might choose hardware with a hardware RNG.

   - A system engineer might write an independent seed from their
     laptop to /var/db/entropy-file on each device (or cloud instance)
     after flashing it with the standard OS image.

   - A system engineer might
     (a) start a daemon in one rc script that reads a seed over a
         serial port to a Geiger counter with a radiation source,
         asynchronously; and
     (b) read a byte from /dev/random in another rc script that has to
         wait until the seeding daemon has done its job.

   - A system engineer might flip a coin 256 times, open a shell, and
     type `echo tthhhhhthhhththtt... > /dev/random', before starting
     any applications in a live system.

 * The application view.  Someone who writes an application, like a
   mail server, which might run in many different systems, needs a way
   to generate secrets that will be used for cryptography.  This is
   safe only after the entropy pool is seeded -- but the application
   engineer, who is just writing software, is not assembling the whole
   system and so can't arrange to set the application up next to a
   real radiation source and Geiger counter.

   - An application that has a definite startup phase, like the Postfix
     master daemon, might reasonably read a single byte from /dev/random
     at startup, and then use /dev/urandom in all its subprocesses.

   - An application might reasonably have a command-line argument for
     a seed file, which is also useful because it facilitates
     deterministic testing, like gcc's -frandom-seed.

   - An application might just quietly defer the decision to the
     system engineer, but the quieter this is, the greater the risk
     the application will be deployed with a fatal insecurity.

 * The library view.  Someone who writes a library used by many
   applications, like this Rust vendor/rand library, needs a way to
   get at secrets from the operating system that will be used to
   derive other secrets in the library.  This, again, is safe only
   after the entropy pool is seeded, but the library engineer has to
   make it usable in _many_ applications.

   A library might read from
   - /dev/random,
   - /dev/urandom,
   - getentropy(),
   - getrandom(),
   - sysctl kern.arandom,
   or something like that.  For example, our arc4random library reads
   from sysctl kern.arandom.  Alternatively, a library could accept a
   parameter -- to be passed by the application -- for a seed, and
   avoid talking to the operating system at all.

   Libraries may also be constrained by blocking: it is at least rude,
   and sometimes a fatal bug or a deadlock, for a library to block
   when it is expected not to block.  It may be especially bad if a
   library is expected not to block, but blocks _sometimes and only in
   extremely infrequent circumstances_, like how a POSIX clock skips a
   beat sometimes but only once every year or two (and simultaneously
   all over the world, when it does), making it unlikely that the code
   path will be exercised during tests.

   Some programs like gpg function more like libraries than like
   applications in that they are used as subroutines by other
   programs; the fact that gpg insists on reading every byte of every
   candidate RSA modulus from /dev/random has led to decades of
   justifiable frustration with using it as a subroutine.

 * The OS view.  Someone who writes an operating system used by many
   system engineers that run many applications, using many libraries,
   needs to provide interfaces to seed the entropy pool, to wait for
   the entropy pool to be seeded, and to generate secrets.

   In NetBSD, to seed the entropy pool, we have:
   - drivers for hardware RNG devices (and faux RNG devices like clocks),
   - boot loader support for loading a seed from disk,
   - /etc/rc.d support for loading a seed from disk, and
   - a writable /dev/random into which you can dump seed material.

   In NetBSD, to wait for the entropy pool to be seeded, we have:
   - a readable /dev/random which may block until something happens to
     bring the entropy pool over a threshold.

   In NetBSD, to generate secrets, we have:
   - a readable /dev/urandom,
   - sysctl kern.arandom / kern.urandom.
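
As a rough illustration of how an application with a definite startup
phase might combine these interfaces (wait once on /dev/random, then
draw from /dev/urandom, with an optional seed file for deterministic
testing), here is a minimal Python sketch.  The function names, the
seed_file parameter, and the 32-byte default are invented for this
example and are not taken from Postfix or any other program; the
device paths are parameters only so the logic can be exercised without
blocking.

```python
def wait_until_seeded(random_dev="/dev/random"):
    """Block until one byte can be read from random_dev, then discard it.

    The byte itself is not used; the read serves only as a barrier.
    """
    with open(random_dev, "rb", buffering=0) as f:
        if len(f.read(1)) != 1:
            raise OSError("short read from " + random_dev)


def initial_secret(nbytes=32, seed_file=None,
                   random_dev="/dev/random", urandom_dev="/dev/urandom"):
    """Return nbytes of key material for the application.

    If seed_file is given (hypothetical command-line option), read the
    seed from it; this is the deterministic-testing path.  Otherwise
    wait once for the pool to be seeded and then read urandom_dev.
    """
    if seed_file is not None:
        with open(seed_file, "rb") as f:
            seed = f.read(nbytes)
        if len(seed) != nbytes:
            raise ValueError("seed file is shorter than requested")
        return seed

    wait_until_seeded(random_dev)          # may block, but only here
    with open(urandom_dev, "rb", buffering=0) as f:
        secret = f.read(nbytes)
    if len(secret) != nbytes:
        raise OSError("short read from " + urandom_dev)
    return secret
```

The point of the structure is that the only call that can block is
made once, at a well-defined place during startup, so the blocking
path is exercised on every run rather than only in rare
circumstances.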

The Rust logic at issue in the tech-pkg@ thread is:

- inside a system, our bulk build process;
- inside an application, rustc and the Rust bootstrap build;
- inside a library, vendor/rand;
- running on NetBSD.

We could in principle resolve the problem at any of these four levels
-- system, application, library, and OS:

- Change the bulk build system.  We could replace /dev/random by a
  symlink to /dev/urandom, as Nia suggested on tech-pkg@, in the
  chroot or Xen guest where the builds happen.

  Provided that chroot or Xen guest is _not_ used for (e.g.) signing
  packages, key generation, sshd exposed to the internet, &c., and is
  limited only to building packages, this is safe.  Of course, it only
  addresses our bulk builders.  So, it's a _generally_ risky and
  limited change, but it would likely serve our needs here.

- Patch the Rust build.  Maybe we could patch rustc or the Rust
  bootstrap process to use a seed as a command-line argument, like
  gcc's -frandom-seed; as gdt mentioned on tech-pkg@, it is hard to
  imagine that it actually needs _secrets_.  Or maybe there's a way to
  do this already, e.g. for reproducible builds, and we could take
  advantage of that.

  (Conceivably it might use a hash table with a universal hash family,
  which does need a key that is unpredictable in advance, at least, to
  prevent a hash-flooding attack via Rust source code, but this is
  far-fetched.)

- Patch the Rust library.  We could -- and indeed, we apparently _do_
  -- patch the library vendor/rand so that it _does not_ read from
  /dev/random, and only uses /dev/urandom or (better) kern.arandom.

  But there's a risk here: applications may _rely_ on the library
  vendor/rand to block until the entropy pool is seeded.  So this
  change can introduce a vulnerability where there was none before.
  It's hard to imagine such a change could affect the build process,
  but it could certainly affect Rust applications, especially those
  deployed in appliances, without careful system engineering.

  I don't know what assumptions downstream consumers of this library
  make.  Maybe this is fine -- maybe blocking on /dev/random is
  actually a bug, because _every_ application using it actually
  ensures the entropy pool is seeded some other way, or doesn't care.
  But the fact that the logic was here to begin with makes me
  suspicious that it is actually important to block until seeded.

- Change NetBSD.  We could change NetBSD's interfaces: /dev/random,
  /dev/urandom, kern.arandom; maybe add Linux getrandom(2), OpenBSD
  getentropy(2).

  It has become popular to redefine the traditional semantics of
  /dev/random or /dev/urandom so that one or both will block once at
  boot until the OS thinks the entropy pool may have been seeded, and
  then never block again.

  I don't want to do this because code paths that may block but only
  in extreme circumstances, like early at boot on an embedded system,
  are likely never to be exercised even during what might otherwise be
  extensive testing, and as noted blocking when not expected can have
  severe consequences.

  But maybe it would not be so bad to do this in a new interface like
  getentropy(2), since there is now lots of existing code that uses
  exactly that interface, much as I think it is a bad idea to
  establish such extremely unlikely blocking behaviour in an API.

  Of course, that doesn't help with building Rust on netbsd-8.
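
For concreteness, the first option above (replacing /dev/random by a
symlink to /dev/urandom inside the build sandbox) might look roughly
like the following Python sketch.  The function name and the root
argument are hypothetical; on a real chroot this would have to run
with sufficient privileges, and, as noted above, it is only
appropriate for a sandbox that does nothing but build packages.

```python
import os


def point_random_at_urandom(root):
    """Replace <root>/dev/random with a symlink to urandom.

    root is the top of the chroot (hypothetical parameter).  The link
    target is relative, so it resolves to <root>/dev/urandom both from
    outside and from inside the chroot.
    """
    dev_random = os.path.join(root, "dev", "random")
    # lexists() is true for device nodes, regular files, and dangling
    # symlinks alike, so any previous entry is removed first.
    if os.path.lexists(dev_random):
        os.unlink(dev_random)
    os.symlink("urandom", dev_random)
    return dev_random
```

After this, a read from /dev/random inside that sandbox behaves
exactly like a read from /dev/urandom, which is why the sandbox must
not be used for key generation or signing.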