Re: [PATCH] random: avoid arch_get_random_seed_long() when collecting IRQ randomness

From: Ard Biesheuvel
Date: Mon Dec 07 2020 - 10:35:55 EST


On Mon, 7 Dec 2020 at 15:28, Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
>
> Hi Ard,
>
> On Tue, Dec 1, 2020 at 1:24 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> > > > > is implemented. In most cases, these are special instructions, but in
> > > > > some cases, such as on ARM, we may want to back this using firmware
> > > > > calls, which are considerably more expensive.
>
> This seems fine. But I suppose I'm curious to learn more about what
> you have in mind for ARM. We've been assuming that arch_get_random is
> not horribly expensive. Usually external RNGs that are horribly
> expensive separate hardware take a different route and aren't hooked
> into arch_get_random. When you say "we may want to back this using
> firmware", does that mean it hasn't happened yet, and you're in a
> position to direct the design otherwise? If so, would it be reasonable
> to take a different route with that hardware, and keep arch_get_random
> for when it's actually implemented by the hardware? Or are there
> actually good reasons for keeping it one and the same?
>

Many older generation ARM SoCs have IP blocks that expose an entropy
source of some kind, and map it in the normal world, which is
accessible by the OS directly. These are driven as hwrngs via the
driver stack, which models them as actual devices, with clock and
regulator handling, power management hooks, etc etc.

There are multiple examples where such a SoC is being revved up with
newer cores etc, and now, the IP block is in the secure world, which
means the OS cannot access it directly, and needs to issue an SMC
instruction to perform a firmware call to access it. The secure world
firmware is now entirely in charge of the hardware, and so this SMC
call is really the only thing that goes on in this driver (no clocks,
regulators, etc)

So to prevent fragmentation, as well as make the entropy source
available much earlier in the boot, ARM has issued a firmware spec
that unifies these SMC calls, and defines them as non-blocking, i.e.,
return the requested number of entropy bits, or fail immediately.

Therefore, this should not be super expensive, given that the only
overhead is the CPU cycles spent on calling into the firmware (and a
bit of overhead perhaps from poking some MMIO registers in the IP
block). But it is definitely not suitable for being called hundreds of
times per second, hence this patch. (Note that we are talking about
arch_get_random_seed_long() here, not arch_get_random_long())

> On the other hand, rdrand on intel is getting slower and slower, to
> the point where we've removed it from a few places that used to use
> it. And I don't see anything terribly wrong with removing the extra
> call in this path here. So need be, I'll offer my Reviewed-by. But I
> wanted to get an understanding of the fuller story first.
>

Given that we already have both arch_get_random_seed_long() and
arch_get_random_long(), I think it is reasonable for the former to be
allowed to be slightly more expensive, and we should only invoke it
for the purpose of reseeding a pseudo-RNG. If this occurs on a hot
path, there is something terribly wrong already, so I don't think
RDRAND/RDSEED performance should be a huge concern here.

Note that once this patch lands, we also intend to change the current
way the arm64 RNDR and RNDRRS instructions are mapped to
arch_get_random_seed_long() and arch_get_random_long(). (RNDR returns
64-bit from a DRBG that reseeds an an implementation defined rate, and
RNDRRS does the same but forces a reseed to occur first)

Thanks,
Ard.