tech-kern archive

Re: Patch: new random pseudodevice



> In what sense are bits really ever "taken out"?

"Revealed to userland", of course.

The idea here is that entropy that has been revealed to userland might
as well not be present.  With good mixing at appropriate points, this
is of questionable truth, but it is, as you said, a very conservative
approach; it amounts to assuming userland has unlimited computational
power available to invert the mixing.  Combined with the conservative
approach to estimating how much entropy was put into the pool, it is a
reasonably good way of making sure that when you ask for strongly
random bits, you get strongly random bits uncorrelated with anyone
else's bits.  Your change loses this property; it depends instead on
something I might call an "entropy stretcher", which takes some number
of random bits and produces a much larger number of no-longer-random
bits.  In effect, even the supposedly-strongly-random device becomes
just a PRNG.  (A complex one among PRNGs, but still a PRNG.)
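
To make the accounting concrete, here is a minimal sketch of the
conservative model described above.  It is a toy, not the actual rnd
code: the class name ToyPool, the 256-bit cap, and the use of SHA-256
for mixing are all assumptions made for illustration.  The only point
is that every bit handed to the reader is debited from the estimate,
and a read that would overdraw it is refused.

```python
import hashlib


class ToyPool:
    """Toy model of conservative entropy accounting (hypothetical, not rnd(4))."""

    def __init__(self):
        self.state = b"\x00" * 32
        # Conservative estimate of unrevealed information, in bits.
        self.estimate_bits = 0

    def add(self, sample: bytes, credited_bits: int) -> None:
        # Mix the sample in and credit only what the caller vouches for.
        self.state = hashlib.sha256(self.state + sample).digest()
        self.estimate_bits = min(256, self.estimate_bits + credited_bits)

    def read(self, nbytes: int) -> bytes:
        need = nbytes * 8
        if need > self.estimate_bits:
            # A real strong device would block here; the toy just refuses.
            raise BlockingIOError("not enough credited entropy")
        out = hashlib.sha256(b"out" + self.state).digest()[:nbytes]
        # Step the state so the same output is never produced twice.
        self.state = hashlib.sha256(b"next" + self.state).digest()
        # Every revealed bit is treated as no longer secret.
        self.estimate_bits -= need
        return out
```

With 64 bits credited, a 4-byte read succeeds and leaves 32 bits on the
books; a following 8-byte read is refused until more input is credited.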

> If there is some kind of correlation between the bits you get from
> the pool now and the bits you got from the pool then, the right
> answer is not to put more bits in and hope the correlation gets
> worse; it is to correct the output function so that finding such a
> correlation is actually cryptographically hard.

It's true that better mixing on output is a good thing.

However, it does not fix the fundamental problem that you can't get out
more information than you put in.  Even a _good_ PRNG can't avoid
correlated output bits, even if the correlation is complex enough to be
"hard" to exploit.  You are replacing a very conservative and
well-defined concept (the amount of unrevealed information remaining in
the pool), even if a somewhat misleading term ("entropy") is used for
it, with a vague hope/belief that your PRNGs are hard to invert.

> Before, bits were "extracted" from the pool with a construct nobody
> had really studied, and we counted every bit output as if it had been
> somehow "consumed".  Even though we didn't actually understand what
> "consumed" meant.

Maybe you didn't.  I thought it was perfectly clear: information
exposed to userland reduces the amount of secret random information
content remaining in the pool.

In practice, I doubt your changes weaken it much...yet.

In theory, they are pretty horrible.  Information content is a fairly
well-defined concept, and the old code took a conservative approach to
measuring it and doling it out.  You are replacing that with something
that appears to think it can turn a small amount of information into a
large amount, which is not possible; the information content of the
output of your per-device PRNG cannot be more than the amount of
information you keyed it with, even if the correlations are currently
difficult to see.
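
As a sketch of what "cannot be more than the amount of information you
keyed it with" means, take the unlimited-computation assumption
literally.  Everything below is hypothetical: the 16-bit key is
deliberately tiny so the search finishes quickly, and the SHA-256
counter-mode stream() is a stand-in, not the per-device generator in
the patch.  An observer who sees a few output bytes tries every key
and then knows all the remaining output.

```python
import hashlib

KEY_BITS = 16  # deliberately tiny so exhaustive search is fast (assumption)


def stream(key: int, nbytes: int) -> bytes:
    """Toy keyed generator: SHA-256 over key and a running counter."""
    out = b""
    counter = 0
    while len(out) < nbytes:
        block = key.to_bytes(2, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:nbytes]


def recover_key(prefix: bytes):
    """Try every possible key until one reproduces the observed prefix."""
    for guess in range(2 ** KEY_BITS):
        if stream(guess, len(prefix)) == prefix:
            return guess
    return None


secret = 0xBEEF
observed = stream(secret, 64)

# The observer sees only the first 8 bytes ...
guess = recover_key(observed[:8])

# ... and can then reproduce the other 56 bytes without seeing them.
predicted_rest = stream(guess, 64)[8:]
```

With a realistically sized key the search is computationally out of
reach, but that is a statement about the cost of the search, not about
how much information the output contains.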

I would welcome better mixing on output.  But this information
stretching for the supposedly-strongly-random device is, in my opinion,
just plain broken.

> And note that at least one highly-thought-of modern design for an
> entropy collector (Fortuna) doesn't even _try_ to keep an "entropy
> estimate"

Because one popular system makes a mistake, we should make the same
mistake?  (Actually, see my last paragraph, below.)

> -- the whole concept is pretty fuzzy when you start trying to count
> how many bits you "took out".

Not fuzzy at all.  Read "unrevealed information content" for "entropy"
and it makes a whole lot of sense.  The number of bits you "took out"
is the number of bits of information revealed to elsewhere.  (The
amount of information content, not necessarily the number of bits of
apparent information - if you feed 32 bits of information into a hash
function, you get at most 32 bits of information content out, even if
they're spread across multiple hundreds of output bits.)
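
The hash example can be checked by counting.  A minimal sketch, using
12 input bits instead of 32 so it runs in a moment, and SHA-256 as an
arbitrary choice of hash (both are assumptions for illustration): the
256-bit digest can take no more distinct values than there are inputs,
so its information content is bounded by the input size.

```python
import hashlib
import math


def distinct_outputs(seed_bits: int) -> int:
    """Hash every possible seed_bits-bit input and count distinct digests."""
    seen = set()
    for seed in range(2 ** seed_bits):
        seen.add(hashlib.sha256(seed.to_bytes(4, "big")).digest())
    return len(seen)


n = distinct_outputs(12)

# Upper bound on the information carried by one 256-bit digest, in bits.
max_info_bits = math.log2(n)
```

Here n cannot exceed 2**12, so max_info_bits cannot exceed 12, however
many output bits the digest is spread across.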

It's possible there's something going on here I don't understand, which
invalidates these arguments.  I'd welcome any pointers to such a thing.
But until then, I'm going to stick with the information-theoretical
point of view that you can't get more information out than you put in,
and call this "key a PRNG and then generate more bits of output than
there were in the key" implementation of the supposedly-strongly-random
device broken.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse%rodents-montreal.org@localhost
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
