Re: [PATCH 01/11] random: don't feed stack data into pool when interrupt regs NULL

From: Sebastian Andrzej Siewior
Date: Fri Apr 04 2014 - 12:55:11 EST


On 2013-10-01 08:44:24 [-0400], Theodore Ts'o wrote:
> The changes queued for the next merge window in the random tree solve
> this problem slightly differently:
>
> ...
> input[0] = cycles ^ j_high ^ irq;
> input[1] = now ^ c_high;
> ip = regs ? instruction_pointer(regs) : _RET_IP_;
> input[2] = ip;
> input[3] = ip >> 32;
>
> fast_mix(fast_pool, input);
> ...
>
> (Note the lack of nbytes parameter in fast_mix --- there are some
> optimizations so that we mix in the changes by 32-bit words, instead
> of bytes, and the number of 32-bit words is fixed to 4, since it's the
> only way fast_mix is called).
>
> _RET_IP_ isn't that much better than 0, but it's at least kernel
> specific, and I figured it was better to shut up bogus warnings, as
> opposed to trying to depend on stack garbage (which would likely be
> kernel specific anyway).

That ip pointer was earlier optional. Now with _RET_IP_ it is a
constant since there is _one_ caller. Plus on 32bit the upper bits are
always zero. It probably didn't get worse because the four bytes on
stack were mostlikly constant as well. [2] is constant if _RET_IP_ is
used. The IP is kind of random. The lower bits are mostlikely 0 due to
32bit alignment (not on x86, okay).
Lets look further. c_high is != 0 only if cycles is larger than 4 bytes.
This is in most cases a long which makes 4 bytes on all 32bit arches.
This makes [1] the lower bytes of jiffies. And you can imagine how often
the upper 16bit change.
Which brings me to [0]. The irq number changes now and then and mostlikely
only the lower 8 bit. j_high is 0 on 32bit platforms. Even on 64bit
with HZ=250 the lower 32bit overflows every ~198 days unless I miscalculated.
Doesn't this make it a constant?
And finally cycles which is random_get_entropy(). On ARM (as previously on
MIPS) this returns 0. Well not always but for all platforms which do not
implement register_current_timer_delay() which makes a lot of them.
This makes
[0] = irq (8 bit)
[1] = jiffies
[2] = a constant unless regs is available and
[3] = 0

from looking at the code it reads like 16 random bytes are fed but now it
looks a little different.

May I kill this and save a few cycles in irq context?
Why don't you take a small amount of randomness and use that as a key and
feed into a block cipher in CTR mode instead of giving it to user right
away? This _can_ work in parallell and should provide *good* pseudo random
numbers on demand with a high performance.

But seriously. What about this: