Re: [PATCH v7 3/6] random: use SipHash in place of MD5

From: Andy Lutomirski
Date: Wed Dec 21 2016 - 21:10:21 EST


On Wed, Dec 21, 2016 at 6:07 PM, Hannes Frederic Sowa
<hannes@xxxxxxxxxxxxxxxxxxx> wrote:
> On 22.12.2016 00:42, Andy Lutomirski wrote:
>> On Wed, Dec 21, 2016 at 3:02 PM, Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
>>>  unsigned int get_random_int(void)
>>>  {
>>> -        __u32 *hash;
>>> -        unsigned int ret;
>>> -
>>> -        if (arch_get_random_int(&ret))
>>> -                return ret;
>>> -
>>> -        hash = get_cpu_var(get_random_int_hash);
>>> -
>>> -        hash[0] += current->pid + jiffies + random_get_entropy();
>>> -        md5_transform(hash, random_int_secret);
>>> -        ret = hash[0];
>>> -        put_cpu_var(get_random_int_hash);
>>> -
>>> -        return ret;
>>> +        unsigned int arch_result;
>>> +        u64 result;
>>> +        struct random_int_secret *secret;
>>> +
>>> +        if (arch_get_random_int(&arch_result))
>>> +                return arch_result;
>>> +
>>> +        secret = get_random_int_secret();
>>> +        result = siphash_3u64(secret->chaining, jiffies,
>>> +                              (u64)random_get_entropy() + current->pid,
>>> +                              secret->secret);
>>> +        secret->chaining += result;
>>> +        put_cpu_var(secret);
>>> +        return result;
>>>  }
>>>  EXPORT_SYMBOL(get_random_int);
>>
>> Hmm. I haven't tried to prove anything for real. But here goes (in
>> the random oracle model):
>>
>> Suppose I'm an attacker and I don't know the secret or the chaining
>> value. Then, regardless of what the entropy is, I can't predict the
>> numbers.
>>
>> Now suppose I do know the secret and the chaining value due to some
>> leak. If I want to deduce prior outputs, I think I'm stuck: I'd need
>> to find a value "result" such that prev_chaining + result = chaining
>> and result = H(prev_chaining, ..., secret). I don't think this can
>> be done efficiently in the random oracle model regardless of what the
>> "..." is.
>>
>> But, if I know the secret and chaining value, I can predict the next
>> output assuming I can guess the entropy. What's worse is that, even
>> if I can't guess the entropy, if I *observe* the next output then I
>> can calculate the next chaining value.
>>
>> So this is probably good enough, and making it better is hard. Changing it to:
>>
>> u64 entropy = (u64)random_get_entropy() + current->pid;
>> result = siphash(..., entropy, ...);
>> secret->chaining += result + entropy;
>>
>> would reduce this problem by forcing an attacker to brute-force the
>> entropy on each iteration, which is probably an improvement.
>>
>> To fully fix it, something like "catastrophic reseeding" would be
>> needed, but that's hard to get right.
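
To make the difference between the two chaining updates concrete, here
is a rough userspace sketch. It is not the kernel code: toy_prf() is a
made-up stand-in for SipHash (it is NOT a real PRF), and struct
toy_state, next_v7(), next_mixed() and attacker_next_chaining() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy keyed mixing function. Stand-in for SipHash; NOT cryptographic. */
uint64_t toy_prf(uint64_t a, uint64_t b, uint64_t c, uint64_t key)
{
        uint64_t x = a ^ key;

        x = (x ^ (x >> 33)) * 0xff51afd7ed558ccdULL;
        x += b;
        x = (x ^ (x >> 33)) * 0xc4ceb9fe1a85ec53ULL;
        x += c;
        return x ^ (x >> 29);
}

/* Hypothetical model of the per-CPU state in the quoted patch. */
struct toy_state {
        uint64_t chaining;
        uint64_t secret;
};

/* Update as in the quoted patch: only the output feeds the chain. */
uint64_t next_v7(struct toy_state *s, uint64_t jiffies, uint64_t entropy)
{
        uint64_t result = toy_prf(s->chaining, jiffies, entropy, s->secret);

        s->chaining += result;
        return result;
}

/* Suggested update: the entropy input also feeds the chain. */
uint64_t next_mixed(struct toy_state *s, uint64_t jiffies, uint64_t entropy)
{
        uint64_t result = toy_prf(s->chaining, jiffies, entropy, s->secret);

        s->chaining += result + entropy;
        return result;
}

/*
 * What an attacker can compute after leaking the old chaining value and
 * observing one output, without knowing the entropy input.
 */
uint64_t attacker_next_chaining(uint64_t leaked_chaining, uint64_t observed)
{
        return leaked_chaining + observed;
}

/* Returns 1 if the attacker tracks next_v7() but loses next_mixed(). */
int toy_self_check(void)
{
        struct toy_state a = { .chaining = 7, .secret = 42 };
        struct toy_state b = a;
        uint64_t ra = next_v7(&a, 1000, 0x1234);
        uint64_t rb = next_mixed(&b, 1000, 0x1234);

        assert(ra == rb); /* same inputs, same output */
        return a.chaining == attacker_next_chaining(7, ra) &&
               b.chaining != attacker_next_chaining(7, rb);
}
```

With next_v7(), observing a single output is enough to stay in sync
with the chaining value. With next_mixed(), the attacker's guess is off
by exactly the entropy term, so it has to be brute-forced on every
call. Neither variant recovers from a leaked secret.
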
>
> I wonder whether Ted's proposal was analyzed further in terms of
> performance, in case get_random_int should provide CPRNG-like properties.
>
> For reference: https://lkml.org/lkml/2016/12/14/351
>
> The proposal made sense to me and would completely solve the
> above-mentioned problem, at the cost of repeatedly reseeding from the crng.
>

Unless I've misunderstood it, Ted's proposal causes get_random_int()
to return bytes straight from urandom (effectively), which should make
it very strong. And if urandom is competitively fast now, I don't see
the problem. ChaCha20 is designed for speed, after all.
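
For what it's worth, here is roughly the shape I have in mind, as a
userspace sketch. It assumes the proposal boils down to handing out
words from a buffer of crng output; that is my reading, not a quote.
toy_extract_block() is a hypothetical, insecure placeholder for pulling
one 64-byte ChaCha20 block from the crng, and struct int_batch and
batched_random_int() are made-up names.

```c
#include <stddef.h>
#include <stdint.h>

#define BATCH_WORDS 16 /* a 64-byte ChaCha20 block holds 16 u32 values */

/* Counts refills so the amortization is visible. Illustration only. */
unsigned int toy_refills;

/*
 * Hypothetical placeholder for extracting one block from the crng.
 * The output here is a predictable pattern and is NOT random.
 */
void toy_extract_block(uint32_t out[BATCH_WORDS])
{
        size_t i;

        for (i = 0; i < BATCH_WORDS; i++)
                out[i] = (uint32_t)((toy_refills * BATCH_WORDS + i) *
                                    2654435761u);
        toy_refills++;
}

struct int_batch {
        uint32_t words[BATCH_WORDS];
        size_t pos; /* next unused word; BATCH_WORDS means empty */
};

/* Hand out one word, refilling the batch only when it runs dry. */
uint32_t batched_random_int(struct int_batch *b)
{
        uint32_t ret;

        if (b->pos >= BATCH_WORDS) {
                toy_extract_block(b->words);
                b->pos = 0;
        }
        ret = b->words[b->pos];
        b->words[b->pos++] = 0; /* don't keep handed-out values around */
        return ret;
}
```

The point is only that the cost of one block extraction is spread over
16 calls, which is why I'd expect this to be competitive. A real version
would need a per-CPU buffer and the appropriate preemption handling,
which this sketch ignores.
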