Re: [PATCH] random: Fix the issue of '_might_sleep' function running in an atomic contex
From: Jason A. Donenfeld
Date: Fri Apr 19 2024 - 04:56:01 EST
On Fri, Apr 19, 2024 at 04:41:12PM +0800, Guoyong Wang wrote:
> On Wed, 17 Apr 2024 14:01:11 +0200, Jason A. Donenfeld wrote:
> > The entropy accounting changes a static key when the RNG has
> > initialized, since it only ever initializes once. Static key changes,
> > however, cannot be made from atomic context, so depending on where the
> > last creditable entropy comes from, the static key change might need to
> > be deferred to a worker.
> >
> > Previously the code used the execute_in_process_context() helper
> > function, which accounts for whether or not the caller is
> > in_interrupt(). However, that doesn't account for the case where the
> > caller is actually in process context but is holding a spinlock.
> >
> > This turned out to be the case with input_handle_event() in
> > drivers/input/input.c contributing entropy:
> >
> > [<ffffffd613025ba0>] die+0xa8/0x2fc
> > [<ffffffd613027428>] bug_handler+0x44/0xec
> > [<ffffffd613016964>] brk_handler+0x90/0x144
> > [<ffffffd613041e58>] do_debug_exception+0xa0/0x148
> > [<ffffffd61400c208>] el1_dbg+0x60/0x7c
> > [<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
> > [<ffffffd613011294>] el1h_64_sync+0x64/0x6c
> > [<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
> > [<ffffffd613102b54>] __might_sleep+0x44/0x7c
> > [<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
> > [<ffffffd6132c2820>] static_key_enable+0x14/0x38
> > [<ffffffd61400ac08>] crng_set_ready+0x14/0x28
> > [<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
> > [<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
> > [<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
> > [<ffffffd613857e54>] add_input_randomness+0x38/0x48
> > [<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
> > [<ffffffd613a81310>] input_event+0x6c/0x98
> >
> > According to Guoyong, it's not really possible to refactor the various
> > drivers to never hold a spinlock there. And in_atomic() isn't reliable.
> >
> > So, rather than trying to be too fancy, just punt the change in the
> > static key to a workqueue always. There's basically no drawback of doing
> > this, as the code already needed to account for the static key not
> > changing immediately, and given that it's just an optimization, there's
> > not exactly a hurry to change the static key right away, so deferal is
> > fine.
> >
> > Reported-by: Guoyong Wang <guoyong.wang@xxxxxxxxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx
> > Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
> > Signed-off-by: Jason A. Donenfeld <Jason@xxxxxxxxx>
> > ---
> > Guoyong- can you test this and tell me whether it fixes the problem you
> > were seeing? If so, I'll try to get this sent up for 6.9. -Jason
> >
> > drivers/char/random.c | 10 +++++-----
> > 1 file changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/char/random.c b/drivers/char/random.c
> > index 456be28ba67c..2597cb43f438 100644
> > --- a/drivers/char/random.c
> > +++ b/drivers/char/random.c
> > @@ -702,7 +702,7 @@ static void extract_entropy(void *buf, size_t len)
> >
> > static void __cold _credit_init_bits(size_t bits)
> > {
> > - static struct execute_work set_ready;
> > + static DECLARE_WORK(set_ready, crng_set_ready);
> > unsigned int new, orig, add;
> > unsigned long flags;
> >
> > @@ -718,8 +718,8 @@ static void __cold _credit_init_bits(size_t bits)
> >
> > if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
> > crng_reseed(NULL); /* Sets crng_init to CRNG_READY under base_crng.lock. */
> > - if (static_key_initialized)
> > - execute_in_process_context(crng_set_ready, &set_ready);
> > + if (static_key_initialized && system_unbound_wq)
> > + queue_work(system_unbound_wq, &set_ready);
> > atomic_notifier_call_chain(&random_ready_notifier, 0, NULL);
> > wake_up_interruptible(&crng_init_wait);
> > kill_fasync(&fasync, SIGIO, POLL_IN);
> > @@ -890,8 +890,8 @@ void __init random_init(void)
> >
> > /*
> > * If we were initialized by the cpu or bootloader before jump labels
> > - * are initialized, then we should enable the static branch here, where
> > - * it's guaranteed that jump labels have been initialized.
> > + * or workqueues are initialized, then we should enable the static
> > + * branch here, where it's guaranteed that these have been initialized.
> > */
> > if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
> > crng_set_ready(NULL);
> > --
> > 2.44.0
>
> Hi Jason,
>
> Thanks for your feedback. We concur with the proposed change and have verified that it works well
> in our tests. Next, I will provide a patch v2 for the changes discussed.
No need. This is already upstream and has your Reported-by. It'll get
backported to stable too.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e871abcda3b67d0820b4182ebe93435624e9c6a4