Re: [PATCH V3] firmware: google: Test spinlock on panic path to avoid lockups

From: Greg KH
Date: Thu Sep 01 2022 - 11:52:42 EST


On Fri, Aug 19, 2022 at 12:50:59PM -0300, Guilherme G. Piccoli wrote:
> Currently the gsmi driver registers a panic notifier as well as
> reboot and die notifiers. The callbacks registered are called in
> atomic and very limited context - for instance, panic disables
> preemption and local IRQs, and all secondary CPUs (those not
> executing the panic path) are shut down.
>
> With that said, taking a spinlock in this scenario is a dangerous
> invitation for lockup scenarios. So, fix that by checking if the
> spinlock is free to acquire in the panic notifier callback - if not,
> bail out and avoid a potential hang.
>
> Fixes: 74c5b31c6618 ("driver: Google EFI SMI")
> Cc: Ard Biesheuvel <ardb@xxxxxxxxxx>
> Cc: David Gow <davidgow@xxxxxxxxxx>
> Cc: Julius Werner <jwerner@xxxxxxxxxxxx>
> Reviewed-by: Evan Green <evgreen@xxxxxxxxxxxx>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx>
> ---
>
>
> This is a re-submission of the patch - it was in a series [0], but
> Greg suggested that I resubmit it individually so that it gets picked
> up by the relevant maintainers, instead of asking them to merge
> individual patches from a series. Note that I've trimmed the CC
> list a bit; it was bigger due to the patch being in a series...
>
> This is truly the V3 of the patch, below is the diff between versions:
>
> V3:
> - added Evan's review tag - thanks!
>
> V2:
> - do not use spin_trylock anymore, to avoid messing with
> non-panic paths; now we just check the spinlock state in
> the panic notifier before taking it. Thanks Evan for the review!
>
> [0] https://lore.kernel.org/lkml/20220719195325.402745-4-gpiccoli@xxxxxxxxxx/
>
>
> drivers/firmware/google/gsmi.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/firmware/google/gsmi.c b/drivers/firmware/google/gsmi.c
> index adaa492c3d2d..3ef5f3c0b4e4 100644
> --- a/drivers/firmware/google/gsmi.c
> +++ b/drivers/firmware/google/gsmi.c
> @@ -681,6 +681,14 @@ static struct notifier_block gsmi_die_notifier = {
>  static int gsmi_panic_callback(struct notifier_block *nb,
>  			       unsigned long reason, void *arg)
>  {
> +	/*
> +	 * Perform the lock check before effectively trying
> +	 * to acquire it on gsmi_shutdown_reason() to avoid
> +	 * potential lockups in atomic context.
> +	 */
> +	if (spin_is_locked(&gsmi_dev.lock))
> +		return NOTIFY_DONE;
> +

What happens if the lock is grabbed right after testing for it?
Shouldn't you use lockdep_assert_held() instead as the documentation
says to?


>  	gsmi_shutdown_reason(GSMI_SHUTDOWN_PANIC);

You are grabbing the lock way down inside this call, so again, you have
a window where the check above would not have worked :(

I don't think this is fixing anything properly, sorry.

greg k-h
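
The window raised in the review can be shown with a small userspace model.
This is only a sketch under stated assumptions: struct toy_lock, the toy_*()
helpers and the "racer" flag are hypothetical names built on C11 atomics,
not the kernel spinlock API and not the gsmi code. The model checks the lock
state first and lets another context take the lock before the later
acquisition, which is the sequence the questions above describe.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a spinlock; not the kernel's spinlock_t. */
struct toy_lock {
	atomic_bool locked;
};

static void toy_init(struct toy_lock *l)
{
	atomic_init(&l->locked, false);
}

/* Analogue of a "is it locked?" query: a snapshot that can go stale at once. */
static bool toy_is_locked(struct toy_lock *l)
{
	return atomic_load(&l->locked);
}

/* Test and acquire in one atomic step; returns true on success, never spins. */
static bool toy_trylock(struct toy_lock *l)
{
	bool expected = false;

	return atomic_compare_exchange_strong(&l->locked, &expected, true);
}

static void toy_unlock(struct toy_lock *l)
{
	assert(atomic_load(&l->locked));	/* only a held lock may be released */
	atomic_store(&l->locked, false);
}

/*
 * Models the check-then-act pattern. 'racer' stands for another context
 * that takes the lock between the check and the later acquisition.
 * Returns true if a blocking acquire at the end would have to spin.
 */
static bool check_then_acquire_would_spin(struct toy_lock *l, bool racer)
{
	if (toy_is_locked(l))
		return false;		/* bailed out because the lock was held */

	if (racer)
		(void)toy_trylock(l);	/* the window: lock taken after the check */

	return toy_is_locked(l);	/* a blocking acquire would spin here */
}
```

With no racer the check and the later acquisition agree. With a racer the
check passes and the subsequent blocking acquire would still spin. A combined
test-and-acquire such as toy_trylock() has no such gap in this model, though
the V2 changelog above notes that the patch moved away from spin_trylock to
avoid touching non-panic paths.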