Re: random: /dev/random often returns short reads

From: Theodore Ts'o
Date: Wed Jan 18 2017 - 13:07:43 EST


On Wed, Jan 18, 2017 at 04:44:54PM +0100, Denys Vlasenko wrote:
> In this case, what the code does is return fewer bytes,
> even though *it has enough random bytes to return the full request*.
>
> This happens because the patch which added more conservative
> accounting, while technically correct in itself,
> failed to take into account another part of the code
> which was relying on the previous, simplistic logic of
> "if we add N random bytes to a pool, it now has +N random bytes".
>
> In my opinion, that patch should have changed that part of the code
> at the same time it introduced the more conservative accounting.

In an ideal world, yes. I've acknowledged this is a bug, in the "be
conservative in what you send, liberal in what you receive" sense.
But no one complained for three years, and userspace needs to be able
to retry short reads instead of immediately erroring out.
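
As an aside, the retry loop userspace needs here is pretty simple.  A
minimal sketch (my own example, not taken from any particular project)
which just keeps calling read() until the buffer is filled:

#include <errno.h>
#include <unistd.h>

/* Read exactly len bytes from fd, retrying short reads and EINTR. */
static int read_full(int fd, void *buf, size_t len)
{
        unsigned char *p = buf;

        while (len > 0) {
                ssize_t n = read(fd, p, len);

                if (n < 0) {
                        if (errno == EINTR)
                                continue;
                        return -1;      /* real error; errno is set */
                }
                if (n == 0)
                        return -1;      /* EOF; shouldn't happen on /dev/random */
                p += n;
                len -= n;
        }
        return 0;
}

Anything doing blocking reads from /dev/random (or from a pipe or a
socket, for that matter) should already be doing something like this.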

The problem is that changing that code to figure out exactly how many
bytes you need to transfer in order to end up with N random bytes of
credit is non-trivial (see the sketch after the list below). So our
choices are:

1) Transfer more bytes than might be needed to the secondary pool,
which results in resource stranding --- since entropy in the secondary
pool isn't available for reseeding the CRNG. OTOH, given that we're
now using the CRNG solution, and we're only reseeding every five
minutes, I'm not actually all that worried about stranding some extra
entropy bits in the blocking pool, since that's only going to happen
if we have people *using* the /dev/random pool, and so that entropy
will likely be used eventually anyway.

2) Transfer bytes without using the conservative entropy "overwrite"
calculation if the blocking pool is mostly empty. This means we will
be over-estimating the entropy in that pool, which is unfortunate.
One could argue that in all of the cases where people have been
whining about this, they are using HAVEGED, which provides pretend
entropy based on the presumed unpredictability of Intel cache timing,
so careful entropy accounting is kind of silly anyway. However, there
might be some people who really are trying to do careful entropy
measurement, so compromising this isn't really a great idea.

3) Make reads of /dev/random block, or have them call
extract_entropy() and transfer secondary entropy multiple times until
the request is satisfied. This will work, but it is less efficient,
and it makes the code a bit uglier.

4) Close the bug as WAI, and tell Red Hat's paying customers to go
pound sand --- since they should be fixing their userspace anyway, and
the argument that changing OpenSSL to use /dev/random is OK, but
changing it to fix the damned bug in OpenSSL is not, is crazy FIPS
certification talk.
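
For reference, the conservative accounting in question follows an
asymptotic rule: crediting "add" bits to a pool that already holds
some entropy raises the estimate by less than "add", roughly
(pool_size - entropy) * 3/4 * add / pool_size per chunk. What follows
is my own simplified floating-point model of that rule (the kernel's
credit_entropy_bits() does this in fixed point), just to show why
"how much do I need to transfer to end up with N bits" has no
one-line answer:

#include <stdio.h>

/*
 * Toy model of the conservative credit rule: new entropy only moves
 * the pool's credit asymptotically towards the pool size, using the
 * 3/4 approximation and adding at most half a pool's worth per step.
 */
static double credit(double entropy, double pool_bits, double add_bits)
{
        while (add_bits > 0) {
                double chunk = add_bits > pool_bits / 2 ?
                                pool_bits / 2 : add_bits;

                entropy += (pool_bits - entropy) * 0.75 * chunk / pool_bits;
                add_bits -= chunk;
        }
        return entropy;
}

int main(void)
{
        /* Mix 512 bits into an empty 1024-bit blocking pool... */
        printf("%.0f\n", credit(0, 1024, 512));   /* ~384 bits, not 512 */
        return 0;
}

How much you have to feed the pool to actually reach a given credit
depends on how full it already is, which is why option 1 just
over-transfers instead of trying to solve for the exact amount.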

I'm leaning a bit towards 1 if we have to do something (which is what
my proposed, untested patch does).

- Ted