Re: Linux 5.3-rc8

From: Linus Torvalds
Date: Sat Sep 14 2019 - 12:30:48 EST


On Sat, Sep 14, 2019 at 8:02 AM Ahmed S. Darwish <darwish.07@xxxxxxxxx> wrote:
>
> On Thu, Sep 12, 2019 at 12:34:45PM +0100, Linus Torvalds wrote:
> >
> > An alternative might be to make getrandom() just return an error
> > instead of waiting. Sure, fill the buffer with "as random as we can"
> > stuff, but then return -EINVAL because you called us too early.
>
> ACK, that's probably _the_ most sensible approach. Only caveat is
> the slight change in user-space API semantics though...
>
> For example, this breaks the just released systemd-random-seed(8)
> as it _explicitly_ requests blocking behvior from getrandom() here:
>

Actually, I would argue that the "don't ever block, instead fill
buffer and return error instead" fixes this broken case.

> => src/random-seed/random-seed.c:
> /*
> * Let's make this whole job asynchronous, i.e. let's make
> * ourselves a barrier for proper initialization of the
> * random pool.
> */
> k = getrandom(buf, buf_size, GRND_NONBLOCK);
> if (k < 0 && errno == EAGAIN && synchronous) {
> log_notice("Kernel entropy pool is not initialized yet, "
> "waiting until it is.");
>
> k = getrandom(buf, buf_size, 0); /* retry synchronously */
> }

Yeah, the above is yet another example of completely broken garbage.

You can't just wait and block at boot. That is simply 100%
unacceptable, and always has been, exactly because that may
potentially mean waiting forever since you didn't do anything that
actually is likely to add any entropy.

> if (k < 0) {
> log_debug_errno(errno, "Failed to read random data with "
> "getrandom(), falling back to "
> "/dev/urandom: %m");

At least it gets a log message.

So I think the right thing to do is to just make getrandom() return
-EINVAL, and refuse to block.

As mentioned, this has already historically been a huge issue on
embedded devices, and with disks turnign not just to NVMe but to
actual polling nvdimm/xpoint/flash, the amount of true "entropy"
randomness we can give at boot is very questionable.

We can (and will) continue to do a best-effort thing (including very
much using rdread and friends), but the whole "wait for entropy"
simply *must* stop.

> I've sent an RFC patch at [1].
>
> [1] 20190914122500.GA1425@darwi-home-pc">https://lkml.kernel.org/r/20190914122500.GA1425@darwi-home-pc

Looks reasonable to me. Except I'd just make it simpler and make it a
big WARN_ON_ONCE(), which is a lot harder to miss than pr_notice().
Make it clear that it is a *bug* if user space thinks it should wait
at boot time.

Also, we might even want to just fill the buffer and return 0 at that
point, to make sure that even more broken user space doesn't then try
to sleep manually and turn it into a "I'll wait myself" loop.

Linus