Re: [PATCH] random: Fix the issue of '_might_sleep' function running in an atomic contex

From: Guoyong Wang
Date: Fri Apr 19 2024 - 04:41:16 EST


On Wed, 17 Apr 2024 14:01:11 +0200, Jason A. Donenfeld wrote:
> The entropy accounting changes a static key when the RNG has
> initialized, since it only ever initializes once. Static key changes,
> however, cannot be made from atomic context, so depending on where the
> last creditable entropy comes from, the static key change might need to
> be deferred to a worker.
>
> Previously the code used the execute_in_process_context() helper
> function, which accounts for whether or not the caller is
> in_interrupt(). However, that doesn't account for the case where the
> caller is actually in process context but is holding a spinlock.
>
> This turned out to be the case with input_handle_event() in
> drivers/input/input.c contributing entropy:
>
> [<ffffffd613025ba0>] die+0xa8/0x2fc
> [<ffffffd613027428>] bug_handler+0x44/0xec
> [<ffffffd613016964>] brk_handler+0x90/0x144
> [<ffffffd613041e58>] do_debug_exception+0xa0/0x148
> [<ffffffd61400c208>] el1_dbg+0x60/0x7c
> [<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
> [<ffffffd613011294>] el1h_64_sync+0x64/0x6c
> [<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
> [<ffffffd613102b54>] __might_sleep+0x44/0x7c
> [<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
> [<ffffffd6132c2820>] static_key_enable+0x14/0x38
> [<ffffffd61400ac08>] crng_set_ready+0x14/0x28
> [<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
> [<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
> [<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
> [<ffffffd613857e54>] add_input_randomness+0x38/0x48
> [<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
> [<ffffffd613a81310>] input_event+0x6c/0x98
>
> According to Guoyong, it's not really possible to refactor the various
> drivers to never hold a spinlock there. And in_atomic() isn't reliable.
>
> So, rather than trying to be too fancy, just punt the change in the
> static key to a workqueue always. There's basically no drawback of doing
> this, as the code already needed to account for the static key not
> changing immediately, and given that it's just an optimization, there's
> not exactly a hurry to change the static key right away, so deferal is
> fine.
>
> Reported-by: Guoyong Wang <guoyong.wang@xxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
> Signed-off-by: Jason A. Donenfeld <Jason@xxxxxxxxx>
> ---
> Guoyong- can you test this and tell me whether it fixes the problem you
> were seeing? If so, I'll try to get this sent up for 6.9. -Jason
>
> drivers/char/random.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index 456be28ba67c..2597cb43f438 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -702,7 +702,7 @@ static void extract_entropy(void *buf, size_t len)
>
> static void __cold _credit_init_bits(size_t bits)
> {
> - static struct execute_work set_ready;
> + static DECLARE_WORK(set_ready, crng_set_ready);
> unsigned int new, orig, add;
> unsigned long flags;
>
> @@ -718,8 +718,8 @@ static void __cold _credit_init_bits(size_t bits)
>
> if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
> crng_reseed(NULL); /* Sets crng_init to CRNG_READY under base_crng.lock. */
> - if (static_key_initialized)
> - execute_in_process_context(crng_set_ready, &set_ready);
> + if (static_key_initialized && system_unbound_wq)
> + queue_work(system_unbound_wq, &set_ready);
> atomic_notifier_call_chain(&random_ready_notifier, 0, NULL);
> wake_up_interruptible(&crng_init_wait);
> kill_fasync(&fasync, SIGIO, POLL_IN);
> @@ -890,8 +890,8 @@ void __init random_init(void)
>
> /*
> * If we were initialized by the cpu or bootloader before jump labels
> - * are initialized, then we should enable the static branch here, where
> - * it's guaranteed that jump labels have been initialized.
> + * or workqueues are initialized, then we should enable the static
> + * branch here, where it's guaranteed that these have been initialized.
> */
> if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
> crng_set_ready(NULL);
> --
> 2.44.0

Hi Jason,

Thanks for your feedback. We concur with the proposed change and have verified that it works well
in our tests. Next, I will provide a patch v2 for the changes discussed.