Re: [PATCH] random: avoid arch_get_random_seed_long() when collecting IRQ randomness
From: Ard Biesheuvel
Date: Wed Nov 11 2020 - 06:49:07 EST
On Wed, 11 Nov 2020 at 11:46, André Przywara <andre.przywara@xxxxxxx> wrote:
>
> On 11/11/2020 10:05, Ard Biesheuvel wrote:
>
> Hi,
>
> > On Wed, 11 Nov 2020 at 10:45, André Przywara <andre.przywara@xxxxxxx> wrote:
> >>
> >> On 11/11/2020 08:19, Ard Biesheuvel wrote:
> >>
> >> Hi,
> >>
> >>> (+ Eric)
> >>>
> >>> On Thu, 5 Nov 2020 at 16:29, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >>>>
> >>>> When reseeding the CRNG periodically, arch_get_random_seed_long() is
> >>>> called to obtain entropy from an architecture specific source if one
> >>>> is implemented. In most cases, these are special instructions, but in
> >>>> some cases, such as on ARM, we may want to back this using firmware
> >>>> calls, which are considerably more expensive.
> >>>>
> >>>> Another call to arch_get_random_seed_long() exists in the CRNG driver,
> >>>> in add_interrupt_randomness(), which collects entropy by capturing
> >>>> inter-interrupt timing and relying on interrupt jitter to provide
> >>>> random bits. This is done by keeping a per-CPU state, and mixing in
> >>>> the IRQ number, the cycle counter and the return address every time an
> >>>> interrupt is taken, and mixing this per-CPU state into the entropy pool
> >>>> every 64 invocations, or at least once per second. The entropy that is
> >>>> gathered this way is credited as 1 bit of entropy. Every time this
> >>>> happens, arch_get_random_seed_long() is invoked, and the result is
> >>>> mixed in as well, and also credited with 1 bit of entropy.
> >>>>
> >>>> This means that arch_get_random_seed_long() is called at least once
> >>>> per second on every CPU, which seems excessive, and doesn't really
> >>>> scale, especially in a virtualization scenario where CPUs may be
> >>>> oversubscribed: in cases where arch_get_random_seed_long() is backed
> >>>> by an instruction that actually goes back to a shared hardware entropy
> >>>> source (such as RNDRRS on ARM), we will end up hitting it hundreds of
> >>>> times per second.
> >>
> >> May I ask why this should be a particular problem? Form what I gathered
> >> on the web, it seems like most h/w RNGs have a capacity of multiple
> >> MBit/s. Wikipedia [1] suggests that the x86 CPU instructions generate at
> >> least 20 Mbit/s (worst case: AMD's 2500 cycles @ 800 MHz), and I
> >> measured around 78 Mbit/s with the raw entropy source on my Juno
> >> (possibly even limited by slow MMIO).
> >> So it seems unlikely that a few kbit/s drain the hardware entropy source.
> >>
> >> If we consider this interface comparably cheap, should we then not try
> >> to plug the Arm firmware interface into this?
> >>
> >
> > I'm not sure I follow. Are you saying we should not wire up a
> > comparatively expensive firmware interface to
> > arch_get_random_seed_long() because we currently assume it is backed
> > by something cheap?
>
> Yes. I wanted to (ab)use this patch to clarify this. x86 and arm64 use
> CPU instructions (so far), S390 copies from some buffer. PPC uses either
> a CPU instruction or an MMIO access. All of these I would consider
> comparably cheap, especially when compared to a firmware call with
> unknown costs. In fact the current Trusted Firmware implementation[1] is
> not really terse, also the generic SMC dispatcher calls a platform
> defined routine, which could do anything.
> So to also guide the implementation in TF-A, it would be good to
> establish what arch_get_random expects to be. The current
> implementations and the fact that it lives in a header file suggests
> that it's meant as a slim wrapper around something cheap.
>
Reseeding a random number generator is simply not something that you
should need to do hundreds of times per second, regardless of whether
its invocation is from a header file.
> > Because doing so would add significantly to the cost. Also note that a
> > firmware interface would permit other ways of gathering entropy that
> > are not necessarily backed by a dedicated high bandwidth noise source
> > (and we already have examples of this)
>
> Yes, agreed.
> So I have a hwrng driver for the Arm SMCCC TRNG interface ready. I would
> post this, but would like to know if we should drop the proposed
> arch_get_random implementation [2][3] of this interface.
>
No, that would defeat the whole point. There is no reason we should
wait for the entire driver stack to come up just to issue an
architected SMC call. This is basically the reason the spec got issued
in the first place.
> >> I am not against this patch, actually am considering this a nice
> >> cleanup, to separate interrupt generated entropy from other sources.
> >> Especially since we call arch_get_random_seed_long() under a spinlock here.
> >> But I am curious about the expectations from arch_get_random in general.
> >>
> >
> > I think it is reasonable to clean this up a little bit. A random
> > *seed* is not the same thing as a random number, and given that we
> > expose both interfaces, it makes sense to permit the seed variant to
> > be more costly, and only use it as intended (i.e., to seed a random
> > number generator)
>
> That's true, it seems we chickened out on the arm64 implementation
> already, by not using the intended stronger instruction for seed
> (RNDRRS), and not implementing arch_get_random_long() at all.
> But I guess that's another story.
>
Yes it is.