Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()

From: Linus Torvalds
Date: Thu Sep 19 2019 - 16:04:23 EST


On Thu, Sep 19, 2019 at 8:20 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Yes, it hashes it using a good hash, but it does so in a way that
> makes it largely possible to follow the hashing and repeat it and
> analyze it.
>
> That breaks if we have hw randomness, because it does the
>
> if (arch_get_random_long(&v))
> crng->state[14] ^= v;
>
> so it always mixes in hardware randomness as part of the extraction,
> but we don't mix anything else unpredictable - or even
> process-specific - state in.

So this is the other actual _serious_ patch I'd suggest: replace the

if (arch_get_random_long(&v))
crng->state[14] ^= v;

with

if (!arch_get_random_long(&v))
v = random_get_entropy();
crng->state[14] += v;

instead. Yeah, it still doesn't help on machines that don't even have
a cycle counter, but it at least means that you don't have to have a
CPU rdrand (or equivalent) but you do have a cycle counter, now the
extraction of randomness from the pool doesn't just do the
(predictable) mutation for the backtracking, but actually means that
you have some very hard to predict timing effects.

Again, in this case a cycle counter really does add a small amount of
entropy (everybody agrees that modern CPU's are simply too complex to
be predictable at a cycle level), but that's not really the point. The
point is that now doing the extraction really fundamentally changes
the state in unpredictable ways, so that you don't have that "if I
recognize a value, I know what the next value will be" kind of attack.

Which, as mentioned, is actually not a purely theoretical concern.

Note small detail above: I changed the ^= to a +=. Addition tends to
be better (due to carry between bits) when there might be bit
commonalities. Particularly with something like a cycle count where
two xors can mostly cancel out previous bits rather than move bits
around in the word.

With an actual random input from rdrand, the xor-vs-add is immaterial
and doesn't matter, of course, so the old code made sense in that
context.

In the attached patch I also moved the arch_get_random_long() and
random_get_entropy() to outside the crng spinlock. We're not talking
blocking operations, but it can easily be hundreds of cycles with
rdrand retries, or the random_get_entropy() reading an external clock
on some architectures.

Linus
diff --git a/drivers/char/random.c b/drivers/char/random.c
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1057,9 +1057,10 @@ static void _extract_crng(struct crng_state *crng,
(time_after(crng_global_init_time, crng->init_time) ||
time_after(jiffies, crng->init_time + CRNG_RESEED_INTERVAL)))
crng_reseed(crng, crng == &primary_crng ? &input_pool : NULL);
+ if (!arch_get_random_long(&v))
+ v = random_get_entropy();
spin_lock_irqsave(&crng->lock, flags);
- if (arch_get_random_long(&v))
- crng->state[14] ^= v;
+ crng->state[14] += v;
chacha20_block(&crng->state[0], out);
if (crng->state[12] == 0)
crng->state[13]++;