Re: [PATCH] random: ensure mix_interrupt_randomness() is consistent

From: Sebastian Andrzej Siewior
Date: Fri Feb 11 2022 - 03:16:15 EST


On 2022-02-11 02:14:46 [+0100], Jason A. Donenfeld wrote:
> Cc: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Theodore Ts'o <tytso@xxxxxxx>
> Cc: Dominik Brodowski <linux@xxxxxxxxxxxxxxxxxxxx>
> Cc: Sultan Alsawaf <sultan@xxxxxxxxxxxxxxx>
> Signed-off-by: Jason A. Donenfeld <Jason@xxxxxxxxx>
> ---
> drivers/char/random.c | 42 ++++++++++++++++++++++++++++++++++--------
> 1 file changed, 34 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index 9c779f1bda34..caaf3c33bb38 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -1210,14 +1211,39 @@ static u32 get_reg(struct fast_pool *f, struct pt_regs *regs)
> static void mix_interrupt_randomness(struct work_struct *work)
> {
> struct fast_pool *fast_pool = container_of(work, struct fast_pool, mix);
> - u32 pool[ARRAY_SIZE(fast_pool->pool32)];
> + unsigned long pool[ARRAY_SIZE(fast_pool->pool_long)];
> + unsigned int count_snapshot;
> + size_t i;
>
> - /* Copy the pool to the stack so that the mixer always has a consistent view. */
> - memcpy(pool, fast_pool->pool32, sizeof(pool));
> + /* Check to see if we're running on the wrong CPU due to hotplug. */
> + migrate_disable();
> + if (fast_pool != this_cpu_ptr(&irq_randomness)) {

I am not sure that acquire and release semantics are needed here, and
if so, a comment would probably be helpful to explain why.
But I'm trying to avoid the migrate_disable(), so:
To close the race with losing the workqueue bit, wouldn't it be
sufficient to set the count to zero via atomic_cmpxchg()? Also, if the
counter didn't change between the memcpy() and the cmpxchg(), then the
pool wasn't modified in the meantime. So basically:

	do {
		counter = atomic_read(&fast_pool->count); // no need to cast
		memcpy(pool, fast_pool->pool_long, sizeof(pool));
	} while (atomic_cmpxchg(&fast_pool->count, counter, 0) != counter);


Then it also shouldn't matter if we are _accidentally_ running on the
wrong CPU.
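
A rough sketch of the whole function with that loop (untested; assuming
count stays an atomic_t, pool_long as in your patch, and the actual
mixing of pool[] left unchanged):

static void mix_interrupt_randomness(struct work_struct *work)
{
	struct fast_pool *fast_pool = container_of(work, struct fast_pool, mix);
	unsigned long pool[ARRAY_SIZE(fast_pool->pool_long)];
	int counter;

	/*
	 * Snapshot the pool and reset the count in one go: if an interrupt
	 * updates the pool while we copy, the count changes too and the
	 * cmpxchg() fails, so we retry. Writing zero also clears the
	 * workqueue bit, so the hard irq path can queue the work again.
	 */
	do {
		counter = atomic_read(&fast_pool->count);
		memcpy(pool, fast_pool->pool_long, sizeof(pool));
	} while (atomic_cmpxchg(&fast_pool->count, counter, 0) != counter);

	/* ... mix pool[] into the input pool as in your patch ... */
}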

> + migrate_enable();
> + /*
> + * If we are unlucky enough to have been moved to another CPU,
> + * then we set our count to zero atomically so that when the
> + * CPU comes back online, it can enqueue work again.
> + */
> + atomic_set_release(&fast_pool->count, 0);
> + return;
> + }
> +
> + /*
> + * Copy the pool to the stack so that the mixer always has a
> + * consistent view. It's extremely unlikely but possible that
> + * this 2 or 4 word read is interrupted by an irq, but in case
> + * it is, we double check that count stays the same.
> + */
> + do {
> + count_snapshot = (unsigned int)atomic_read(&fast_pool->count);
> + for (i = 0; i < ARRAY_SIZE(pool); ++i)
> + pool[i] = READ_ONCE(fast_pool->pool_long[i]);

Why do you avoid memcpy()? Since it is a small, fixed-size memcpy, I'm
sure the compiler will inline it as plain register moves.
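
i.e. something like this (untested; assuming the loop's exit condition
re-checks the count as the comment in your patch describes):

	do {
		count_snapshot = (unsigned int)atomic_read(&fast_pool->count);
		/* small fixed-size copy, the compiler turns this into plain loads/stores */
		memcpy(pool, fast_pool->pool_long, sizeof(pool));
	} while (count_snapshot != (unsigned int)atomic_read(&fast_pool->count));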

Sebastian