Re: [PATCH 1/2] x86/random: Retry on RDSEED failure

From: H. Peter Anvin
Date: Tue Feb 06 2024 - 14:12:59 EST


On February 3, 2024 6:35:47 AM PST, Theodore Ts'o <tytso@xxxxxxx> wrote:
>On Fri, Feb 02, 2024 at 10:28:01PM +0100, James Bottomley wrote:
>>
>> My big concern is older cpus where rdrand/rdseed don't produce useful
>> entropy. Exhaustion attacks are going to be largely against VMs not
>> physical systems, so I worry about physical systems with older CPUs
>> that might have rdrand issues which then trip our Confidential
>> Computing checks.
>
>For (non-CC) VM's the answer is virtio-rng. This solves the
>exhaustion problem, since if you can't trust the host, the VM's
>security is toast anyway (again, ignoring Confidential Compute).
>
>> The signal for rdseed failing is fairly clear, so if the node has other
>> entropy sources, it should continue otherwise it should signal failure.
>> Figuring out how a confidential computing environment signals that
>> failure is TBD.
>
>That's a design decision, and I believe we've been converging on a
>panic during early boot. Post boot, if we've succeeded
>in initializing the guest kernel's RNG, we're secure so long as the
>cryptographic primitives haven't been defeated --- and if we have,
>such as if quantum computing becomes practical, we've got bigger
>problems anyway.
>
> - Ted

I also want to emphasize that there is a huge difference between boot (initialization) time and runtime. Runtime harvesting has always been opportunistic in Linux, so if RDSEED fails, just try again later. The one exception might be a task blocked on /dev/random, in which case it could make sense to aggressively loop on the blocked core instead of simply putting the process to sleep.
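To make the failure semantics concrete: RDSEED reports success or failure through the carry flag, so opportunistic harvesting is naturally a "poll and fail fast" operation. A minimal sketch (purely illustrative; rdseed_once() is a made-up helper, not an existing kernel interface, and assumes a kernel/GCC inline-asm context):

/* Illustrative only: a single RDSEED attempt.  The instruction sets
 * CF=1 when it returns a valid seed word and CF=0 when the NRBG had
 * nothing ready -- the "fail fast" case discussed above. */
static inline bool rdseed_once(unsigned long *out)
{
	bool ok;

	asm volatile("rdseed %0; setc %1"
		     : "=r" (*out), "=qm" (ok));
	return ok;
}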

Initialization time is a different game entirely. Until we have accumulated roughly 256-512 bits of seed data, even the best PRNG can't really be considered "completely random." Thus a far more aggressive approach is called for, and this is also the time to look for total failure of the NRBG: if after some number N of attempts we still have not acquired enough entropy, warn and optionally panic the system. I believe N should be quite large; spending a full second in the very worst case is probably better than declaring failure prematurely.

By setting the limit in terms of time rather than iterations, we avoid the awkward issue of "the interface to the RDSEED unit is too fast and so it returns failure too often." I don't think anyone would argue that the right thing is to slow down the response time of RDSEED for that reason, even though that would most likely radically reduce the failure rate (because the NRBG would have more time to produce entropy between queries issued at the maximum rate).
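As a rough sketch of how a time-bounded initialization loop might look (again purely illustrative: seed_with_deadline() and the one-second budget are made-up names and numbers, not existing kernel interfaces; it reuses rdseed_once() from above, and whether jiffies is even usable that early in boot is a separate question):

/* Illustrative only: keep retrying RDSEED until either enough seed
 * words have been collected or a wall-clock budget (~1 second here)
 * has been spent; the caller then warns and optionally panics. */
static int __init seed_with_deadline(unsigned long *buf, int nwords)
{
	unsigned long deadline = jiffies + HZ;	/* ~1 second budget */
	int filled = 0;

	while (filled < nwords) {
		if (rdseed_once(&buf[filled])) {
			filled++;
			continue;
		}
		if (time_after(jiffies, deadline))
			break;		/* out of time: caller warns/panics */
		cpu_relax();		/* brief pause before the next attempt */
	}
	return filled;
}

The point is only that the exit condition is a time budget rather than an iteration count, so a faster RDSEED interface does not translate into a higher chance of spurious failure.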

Let's say, entirely hypothetically (as of right now I have absolutely *no* insider information about the RNG unit roadmap), that we were to implement a prefetch buffer in the core, such that a single RD* instruction, or a handful of them, could execute in a handful of cycles, with the core itself issuing the request to the RNG unit whenever there is space in the queue. Such a prefetch buffer could rather obviously get exhausted *very* quickly because the poll rate could be dramatically increased, and having the core stall until there is data may or may not be a good solution (advantage: the CPU can go to a lower power state while waiting; disadvantage: opportunistic harvesting would prefer a "poll and fail fast" variation, *especially* if the CPU is going to fulfill the request autonomously anyway).