Re: early x86 unseeded randomness

From: Ingo Molnar
Date: Tue Aug 15 2017 - 04:06:00 EST



* Willy Tarreau <w@xxxxxx> wrote:

> On Tue, Aug 15, 2017 at 09:42:54AM +0200, Ingo Molnar wrote:
> >
> > * Willy Tarreau <w@xxxxxx> wrote:
> >
> > > Nowadays we could use similar methods using RDTSC providing more accurate
> > > counting. This doesn't provide a lot of entropy of course, given that a
> > > 2 GHz machine will at most count 31 bits there. But I tend to think that
> > > what matters during early boot is to transform something highly predictable
> > > into something unlikely to be predicted (ie: an exploit having to scan 2^31
> > > possible addresses will not be really usable). It's also possible to do the
> > > same with the PIT0 counter ticking at 18.2 Hz without any correlation with
> > > the RTC by the way, and roughly provide 25 more bits. And if you expect
> > > that the BIOS has emitted an 800 Hz beep at boot, you could still have a
> > > divider of 1491 in PIT2 providing 10 more bits, though with a bit of
> > > correlation with PIT0 since they use the same 1.19 MHz source. These
> > > methods increase the boot time by up to one second though, but my point
> > > here is that when you have nothing it's always a bit better.
> >
> > One other thing besides trying to extract entropy via timing would be to utilize
> > more of the machine's environment in seeding the random number generator.
> >
> > For example on x86 the E820 table is available very early on and its addresses
> > could be mixed into the random pool. An external attacker often would not know the
> > precise hardware configuration.
> >
> > Likewise the boot parameters string could be mixed into the initial random pool as
> > well - and this way distributions could create a per-installation seed simply by
> > appending a random number to the boot string.
> >
> > Both methods should be very fast and robust.
>
> Definitely, just like a simple MD5SUM on the first MB of RAM including
> the BIOS, and on the CMOS RAM contents, which also differ quite a bit
> between systems.

In practice it's much faster to process the e820 table and the boot string though
- and that should already give very good inter-machine randomization. It would
also cover odd cases such as virtualization environments, which often have
nothing in the first 1MB.

I.e. 'e820' and 'boot string' are two universally available pieces of
environmental data that could be used in a robust and fast fashion.
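
To make the idea concrete, here is a rough user-space sketch of the mixing
step, not a patch. Everything in it is an assumption for illustration: the
'struct mem_range' layout is a hypothetical stand-in for an e820 entry, the
helper names are made up, and FNV-1a is used only because it is short. It is
not a cryptographic hash and not what the kernel's random pool does. In the
kernel the same bytes would presumably be handed to the existing pool, for
example via add_device_randomness(), rather than hashed by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one e820 entry: base address, length, type. */
struct mem_range {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
};

/* Standard 64-bit FNV-1a constants (illustrative mixer, NOT cryptographic). */
#define FNV64_OFFSET 0xcbf29ce484222325ULL
#define FNV64_PRIME  0x100000001b3ULL

/* Fold a byte buffer into the 64-bit pool, one byte at a time. */
static uint64_t mix_bytes(uint64_t pool, const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i;

	assert(buf != NULL || len == 0);

	for (i = 0; i < len; i++) {
		pool ^= p[i];
		pool *= FNV64_PRIME;
	}
	return pool;
}

/* Mix each range field by field, so struct padding is never hashed. */
static uint64_t mix_mem_map(uint64_t pool, const struct mem_range *map,
			    size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		pool = mix_bytes(pool, &map[i].addr, sizeof(map[i].addr));
		pool = mix_bytes(pool, &map[i].size, sizeof(map[i].size));
		pool = mix_bytes(pool, &map[i].type, sizeof(map[i].type));
	}
	return pool;
}

/* Mix the boot string, including any seed the distribution appended. */
static uint64_t mix_cmdline(uint64_t pool, const char *cmdline)
{
	return mix_bytes(pool, cmdline, strlen(cmdline));
}

/* Derive an early seed from the two pieces of environmental data. */
static uint64_t early_seed(const struct mem_range *map, size_t nr,
			   const char *cmdline)
{
	uint64_t pool = FNV64_OFFSET;

	pool = mix_mem_map(pool, map, nr);
	pool = mix_cmdline(pool, cmdline);
	return pool;
}
```

Calling early_seed() with the same memory map but boot strings that differ
only in an appended number gives different seeds, which is the
per-installation effect described above. Both inputs cost a few hundred bytes
of hashing at most, so there is no measurable boot delay.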

Thanks,

Ingo