Re: [PATCH] random: avoid superfluous call to RDRAND in CRNG extraction

From: Jason A. Donenfeld
Date: Thu Dec 30 2021 - 17:58:13 EST


Hi Ted,

On 12/30/21, Theodore Ts'o <tytso@xxxxxxx> wrote:
> but realistically speaking, in
> crng_init_try_arch_early(), which gets called from rand_initialize(),
> we will have already set crng->state[4..15] via RDSEED or RDRAND.
>
> So there's no point in setting crng->state[0] from RDRAND. So if
> we're wanting to speed things up, we should just remove the
> crng->state[0] <= RDRAND entirely.

Good point, and that seems reasonable. I'll do that for v+1.

> Or if we want to improve the security of get_random_bytes() pre
> crng_ready(), then we should try to XOR RDRAND bytes into all returned
> buffer from get_random_bytes(). In other words, I'd argue that we
> should "go big, or go home". (And if we do have some real,
> security-critical users of get_random_bytes() pre-crng_ready(), maybe
> "go big" is the right way to go.

That's a decent way of looking at it. Rather than dallying with
32 bits, we may as well go all the way. Or, to compromise on
efficiency, we could just XOR 16 or 32 bytes into the key rows
prior to each extraction. Alternatively, we have fewer things to think
about with the "go home" route, and then it's just a matter of
important users using get_random_bytes_wait(), which I think I mostly
took care of through the tree a few years back.
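
To make the compromise option concrete, here is a rough userspace sketch of
XORing a few hardware-random words into the key rows before an extraction.
It assumes the usual ChaCha layout, where words 4..11 of the 16-word state
hold the key. The names crng_sketch, hw_rand32_fn, mix_hw_into_key and
fake_hw_rand32 are made up for illustration and are not the functions in
random.c; the fake source only stands in for something like RDRAND.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ChaCha state layout: words 0-3 constants, 4-11 key, 12-15 counter/nonce. */
#define CHACHA_KEY_FIRST_WORD 4
#define CHACHA_KEY_WORDS      8

/* Hypothetical stand-in for the CRNG state; not the kernel's struct. */
struct crng_sketch {
	uint32_t state[16];
};

/* Hypothetical hardware RNG hook (think RDRAND); returns false on failure. */
typedef bool (*hw_rand32_fn)(uint32_t *out);

/*
 * XOR up to nwords hardware-random words into the key rows.
 * nwords == 4 mixes 16 bytes, nwords == 8 mixes 32 bytes.
 * Stops at the first hardware failure and returns the number of
 * words actually mixed in.
 */
static size_t mix_hw_into_key(struct crng_sketch *crng, hw_rand32_fn hw,
			      size_t nwords)
{
	size_t i;

	assert(nwords <= CHACHA_KEY_WORDS);
	for (i = 0; i < nwords; i++) {
		uint32_t v;

		if (!hw(&v))
			break;
		crng->state[CHACHA_KEY_FIRST_WORD + i] ^= v;
	}
	return i;
}

/* Demonstration-only source: always succeeds with a fixed pattern. */
static bool fake_hw_rand32(uint32_t *out)
{
	*out = 0xA5A5A5A5u;
	return true;
}
```

Calling mix_hw_into_key(&crng, fake_hw_rand32, 4) touches only words 4..7,
so the constants and the counter/nonce words are left alone. Because XOR
cannot remove entropy that is already in the key, a failing or weak
hardware source makes this no worse than skipping the step.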

> So I'm not sure we how desperately we *need* the 370% performance improvement

It's not necessary (aside from, like, people using sendfile to erase
NVMes or something weird?), but it appeals to me for two reasons:
- The superfluous RDRAND with only 32 bits really isn't doing much, and
having it there makes the design ever so slightly more confusing and
less straightforward.
- I would like to see if at some point (not now, just in the future)
it's feasible, performance-wise, to replace all of prandom with
get_batched_random() and company. I was on some thread a few years ago
where a researcher pointed out one place prandom was used when
get_random_u64() should have been, and in the ensuing discussion a few
more places were found with the same issue, and then more. And then
nobody could agree on whether the performance hit was worth it for
whichever security model. And in the end I don't recall anything
really happening. If that whole discussion could magically go away
because we make all uses secure with no performance hit, it'd be a
major win against footguns like prandom. Maybe it won't be feasible in
the end, but simplifying a design in the process of seeing seems like
decent enough motivation.
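
For anyone unfamiliar with why batching matters for the second point, here
is a minimal userspace sketch of the idea: pay for one expensive extraction,
then hand out the resulting words one at a time. Everything here (batch_u64,
refill_fn, batch_get_u64, demo_refill) is a hypothetical name for
illustration, and the sketch leaves out the per-CPU storage and locking that
a kernel implementation would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_WORDS 16

/* Hypothetical expensive generator: fills buf with n fresh 64-bit words. */
typedef void (*refill_fn)(uint64_t *buf, size_t n);

struct batch_u64 {
	uint64_t entropy[BATCH_WORDS];
	size_t position;	/* next unused word; BATCH_WORDS means empty */
};

/* Start empty so the first request triggers a refill. */
static void batch_init(struct batch_u64 *b)
{
	b->position = BATCH_WORDS;
}

/* Return one word, refilling the whole batch only when it runs out. */
static uint64_t batch_get_u64(struct batch_u64 *b, refill_fn refill)
{
	uint64_t ret;

	assert(refill != NULL);
	if (b->position >= BATCH_WORDS) {
		refill(b->entropy, BATCH_WORDS);
		b->position = 0;
	}
	ret = b->entropy[b->position];
	b->entropy[b->position] = 0;	/* do not leave handed-out values behind */
	b->position++;
	return ret;
}

/* Demonstration-only filler: deterministic, and counts its invocations. */
static unsigned int demo_refills;

static void demo_refill(uint64_t *buf, size_t n)
{
	size_t i;

	demo_refills++;
	for (i = 0; i < n; i++)
		buf[i] = (uint64_t)demo_refills * 1000 + i;
}
```

With BATCH_WORDS set to 16, sixteen consecutive calls cost a single refill,
so the per-call cost is dominated by an array read. Whether that gets close
enough to prandom for every caller is exactly the open question.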

Jason