Re: [PATCH V3] firmware: google: Test spinlock on panic path to avoid lockups

From: Greg KH
Date: Thu Sep 01 2022 - 14:59:37 EST


On Thu, Sep 01, 2022 at 03:46:17PM -0300, Guilherme G. Piccoli wrote:
> On 01/09/2022 15:28, Greg KH wrote:
> > [...]
> >> I honestly didn't understand exactly what you're suggesting Greg...
> >> Mind clarifying?
> >
> > Something like this totally untested code:
> >
> > diff --git a/drivers/firmware/google/gsmi.c b/drivers/firmware/google/gsmi.c
> > index adaa492c3d2d..6ad41b22671c 100644
> > --- a/drivers/firmware/google/gsmi.c
> > +++ b/drivers/firmware/google/gsmi.c
> > @@ -19,6 +19,7 @@
> > #include <linux/dma-mapping.h>
> > #include <linux/fs.h>
> > #include <linux/slab.h>
> > +#include <linux/panic.h>
> > #include <linux/panic_notifier.h>
> > #include <linux/ioctl.h>
> > #include <linux/acpi.h>
> > @@ -611,6 +612,11 @@ static const struct attribute *gsmi_attrs[] = {
> > NULL,
> > };
> >
> > +static bool panic_in_progress(void)
> > +{
> > + return unlikely(atomic_read(&panic_cpu) != PANIC_CPU_INVALID);
> > +}
> > +
> > static int gsmi_shutdown_reason(int reason)
> > {
> > struct gsmi_log_entry_type_1 entry = {
> > @@ -629,7 +635,8 @@ static int gsmi_shutdown_reason(int reason)
> > if (saved_reason & (1 << reason))
> > return 0;
> >
> > - spin_lock_irqsave(&gsmi_dev.lock, flags);
> > + if (!panic_in_progress())
> > + spin_lock_irqsave(&gsmi_dev.lock, flags);
> >
> > saved_reason |= (1 << reason);
> >
> > @@ -644,7 +651,8 @@ static int gsmi_shutdown_reason(int reason)
> >
> > rc = gsmi_exec(GSMI_CALLBACK, GSMI_CMD_SET_EVENT_LOG);
> >
> > - spin_unlock_irqrestore(&gsmi_dev.lock, flags);
> > + if (!panic_in_progress())
> > + spin_unlock_irqrestore(&gsmi_dev.lock, flags);
> >
> > if (rc < 0)
> > printk(KERN_ERR "gsmi: Log Shutdown Reason failed\n");
> >
> >
> >
>
> Thanks! Personally, I feel the approach a bit more complex than mine,
> and...racy!
> Imagine CPU0 runs your tests, right after the if (!panic_in_progress())
> is done, spinlock is taken and boom - panic on CPU1. This would cause
> the same issue...

True, it would, but so would yours if the unlock happens and then your
test passes and then this lock is taken and then a panic happens.

There's no "race free" way here perhaps. The joys of notifier chains (I
hate the things...)

> My approach is zero racy, since it checks if spinlock was taken in a
> moment that the machine is like a no-SMP, only a single CPU running...

Ah, I missed that this path is only called if an panic is happening.
Well, also a reboot.

Ick, I don't know, this all feels odd. I want someone else to review
this and give their ack on the patch before I'll take it so someone else
can share in the blame :)

thanks,

greg k-h