Re: [PATCH v3 2/2] random: defer fast pool mixing to worker

From: Eric Biggers
Date: Tue Feb 08 2022 - 19:12:39 EST


On Mon, Feb 07, 2022 at 04:39:14PM +0100, Jason A. Donenfeld wrote:
> +static void mix_interrupt_randomness(struct work_struct *work)
> +{
> +	struct fast_pool *fast_pool = container_of(work, struct fast_pool, mix);
> +
> +	/*
> +	 * Since this is the result of a trip through the scheduler, xor in
> +	 * a cycle counter. It can't hurt, and might help.
> +	 */
> +	fast_pool->pool[3] ^= random_get_entropy();
> +
> +	mix_pool_bytes(&fast_pool->pool, sizeof(fast_pool->pool));
> +	/* We take care to zero out the count only after we're done reading the pool. */
> +	WRITE_ONCE(fast_pool->count, 0);
> +	fast_pool->last = jiffies;
> +	credit_entropy_bits(1);
> +}

So add_interrupt_randomness() can execute on the same CPU, from hard IRQ
context, at any time while this worker is running? That could result in some
pretty weird behavior, where the pool gets changed halfway through being read,
so what gets mixed in is neither the old nor the new state of the pool. Is
there a reason why this is okay?
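
To illustrate the kind of interleaving I have in mind (hypothetical timeline;
the worker runs in process context, so a hard IRQ can land in the middle of
the mix_pool_bytes() call):

	mix_interrupt_randomness()		hard IRQ, same CPU
	--------------------------		------------------
	mix_pool_bytes() reads pool[0..1]
						add_interrupt_randomness()
						  fast_mix() rewrites pool[0..3]
	mix_pool_bytes() reads pool[2..3]

	/* the bytes mixed in are half old state, half new state */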

>  void add_interrupt_randomness(int irq)
>  {
>  	struct fast_pool *fast_pool = this_cpu_ptr(&irq_randomness);
>  	struct pt_regs *regs = get_irq_regs();
>  	unsigned long now = jiffies;
>  	cycles_t cycles = random_get_entropy();
> +	unsigned int new_count;
>  	u32 c_high, j_high;
>  	u64 ip;
>
> @@ -999,9 +1012,10 @@ void add_interrupt_randomness(int irq)
>
>  	fast_mix(fast_pool);
>  	add_interrupt_bench(cycles);
> +	new_count = ++fast_pool->count;
>
>  	if (unlikely(crng_init == 0)) {
> -		if ((fast_pool->count >= 64) &&
> +		if (new_count >= 64 &&
>  		    crng_fast_load((u8 *)fast_pool->pool, sizeof(fast_pool->pool)) > 0) {
>  			fast_pool->count = 0;
>  			fast_pool->last = now;
> @@ -1009,20 +1023,16 @@ void add_interrupt_randomness(int irq)
>  		return;
>  	}
>
> -	if ((fast_pool->count < 64) && !time_after(now, fast_pool->last + HZ))
> +	if (new_count & FAST_POOL_MIX_INFLIGHT)
>  		return;
>
> -	if (!spin_trylock(&input_pool.lock))
> +	if (new_count < 64 && !time_after(now, fast_pool->last + HZ))
>  		return;
>
> -	fast_pool->last = now;
> -	__mix_pool_bytes(&fast_pool->pool, sizeof(fast_pool->pool));
> -	spin_unlock(&input_pool.lock);
> -
> -	fast_pool->count = 0;
> -
> -	/* award one bit for the contents of the fast pool */
> -	credit_entropy_bits(1);
> +	if (unlikely(!fast_pool->mix.func))
> +		INIT_WORK(&fast_pool->mix, mix_interrupt_randomness);
> +	fast_pool->count |= FAST_POOL_MIX_INFLIGHT;
> +	queue_work_on(raw_smp_processor_id(), system_highpri_wq, &fast_pool->mix);
>  }

Is there a reason why FAST_POOL_MIX_INFLIGHT is part of 'count' instead of
being a separate boolean?
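
E.g., with a separate flag it could look something like this (untested sketch;
the 'mix_inflight' field is a name I am making up):

	struct fast_pool {
		...
		unsigned int count;
		bool mix_inflight;	/* hypothetical replacement for FAST_POOL_MIX_INFLIGHT */
	};

	/* in add_interrupt_randomness(): */
	if (fast_pool->mix_inflight)
		return;

	if (new_count < 64 && !time_after(now, fast_pool->last + HZ))
		return;

	fast_pool->mix_inflight = true;
	queue_work_on(raw_smp_processor_id(), system_highpri_wq, &fast_pool->mix);

Then 'count' stays a plain counter, and the worker would clear the flag after
zeroing the count.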

Also, a high-level question. Now that the call to mix_pool_bytes() would no
longer occur in hard IRQ context, how much reason is there to minimize the
amount of data passed to it? Would it be feasible to just concatenate the
interrupt data into an array, and pass the whole array to mix_pool_bytes() --
eliminating the homebrew ARX thing entirely?
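
I.e., something along these lines (completely untested sketch; all the names
and sizes here are invented):

	#define NR_SAMPLES 16

	struct irq_samples {
		u32 data[NR_SAMPLES];	/* one raw sample word per interrupt */
		unsigned int count;
		struct work_struct mix;	/* INIT_WORK()ed once at boot, omitted here */
	};
	static DEFINE_PER_CPU(struct irq_samples, irq_samples);

	/* hard IRQ path: just record a sample, no mixing at all */
	void add_interrupt_randomness(int irq)
	{
		struct irq_samples *s = this_cpu_ptr(&irq_samples);

		s->data[s->count++ % NR_SAMPLES] = random_get_entropy() ^ irq;
		if (s->count % NR_SAMPLES == 0)
			queue_work_on(raw_smp_processor_id(), system_highpri_wq,
				      &s->mix);
	}

	/* worker: feed the raw samples straight into the input pool */
	static void mix_interrupt_samples(struct work_struct *work)
	{
		struct irq_samples *s = container_of(work, struct irq_samples, mix);

		mix_pool_bytes(s->data, sizeof(s->data));
		credit_entropy_bits(1);
	}

The cost would just be a bit more per-CPU memory, plus mix_pool_bytes()
hashing 64 bytes instead of 16.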

- Eric