Re: [PATCH] pinctrl: amd: Disable and mask interrupts on resume

From: Linux regression tracking (Thorsten Leemhuis)
Date: Tue Apr 11 2023 - 08:50:48 EST

On 10.04.23 17:29, Gong, Richard wrote:
> On 4/10/2023 12:03 AM, Mario Limonciello wrote:
>> On 3/20/23 04:32, Kornel Dulęba wrote:
>>
>>> This fixes a similar problem to the one observed in:
>>> commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on
>>> probe").
>>>
>>> On some systems, during suspend/resume cycle firmware leaves
>>> an interrupt enabled on a pin that is not used by the kernel.
>>> This confuses the AMD pinctrl driver and causes spurious interrupts.
>>>
>>> The driver already has logic to detect if a pin is used by the kernel.
>>> Leverage it to re-initialize interrupt fields of a pin only if it's not
>>> used by us.
>>>
>>> Signed-off-by: Kornel Dulęba <korneld@xxxxxxxxxxxx>
>>> ---
>>>   drivers/pinctrl/pinctrl-amd.c | 36 +++++++++++++++++++----------------
>>>   1 file changed, 20 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/drivers/pinctrl/pinctrl-amd.c
>>> b/drivers/pinctrl/pinctrl-amd.c
>>> index 9236a132c7ba..609821b756c2 100644
>>> --- a/drivers/pinctrl/pinctrl-amd.c
>>> +++ b/drivers/pinctrl/pinctrl-amd.c
>>> @@ -872,32 +872,34 @@ static const struct pinconf_ops amd_pinconf_ops
>>> = {
>>>       .pin_config_group_set = amd_pinconf_group_set,
>>>   };
>>>   -static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
>>> +static void amd_gpio_irq_init_pin(struct amd_gpio *gpio_dev, int pin)
>>>   {
>>> -    struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
>>> +    const struct pin_desc *pd;
>>>       unsigned long flags;
>>>       u32 pin_reg, mask;
>>> -    int i;
>>>         mask = BIT(WAKE_CNTRL_OFF_S0I3) | BIT(WAKE_CNTRL_OFF_S3) |
>>>           BIT(INTERRUPT_MASK_OFF) | BIT(INTERRUPT_ENABLE_OFF) |
>>>           BIT(WAKE_CNTRL_OFF_S4);
>>>   -    for (i = 0; i < desc->npins; i++) {
>>> -        int pin = desc->pins[i].number;
>>> -        const struct pin_desc *pd = pin_desc_get(gpio_dev->pctrl, pin);
>>> -
>>> -        if (!pd)
>>> -            continue;
>>> +    pd = pin_desc_get(gpio_dev->pctrl, pin);
>>> +    if (!pd)
>>> +        return;
>>>   -        raw_spin_lock_irqsave(&gpio_dev->lock, flags);
>>> +    raw_spin_lock_irqsave(&gpio_dev->lock, flags);
>>> +    pin_reg = readl(gpio_dev->base + pin * 4);
>>> +    pin_reg &= ~mask;
>>> +    writel(pin_reg, gpio_dev->base + pin * 4);
>>> +    raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
>>> +}
>>>   -        pin_reg = readl(gpio_dev->base + i * 4);
>>> -        pin_reg &= ~mask;
>>> -        writel(pin_reg, gpio_dev->base + i * 4);
>>> +static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
>>> +{
>>> +    struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
>>> +    int i;
>>>   -        raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
>>> -    }
>>> +    for (i = 0; i < desc->npins; i++)
>>> +        amd_gpio_irq_init_pin(gpio_dev, i);
>>>   }
>>>     #ifdef CONFIG_PM_SLEEP
>>> @@ -950,8 +952,10 @@ static int amd_gpio_resume(struct device *dev)
>>>       for (i = 0; i < desc->npins; i++) {
>>>           int pin = desc->pins[i].number;
>>>   -        if (!amd_gpio_should_save(gpio_dev, pin))
>>> +        if (!amd_gpio_should_save(gpio_dev, pin)) {
>>> +            amd_gpio_irq_init_pin(gpio_dev, pin);
>>>               continue;
>>> +        }
>>>             raw_spin_lock_irqsave(&gpio_dev->lock, flags);
>>>           gpio_dev->saved_regs[i] |= readl(gpio_dev->base + pin * 4)
>>> & PIN_IRQ_PENDING;
>>
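For anyone following along without the driver source at hand, here is a
minimal userspace sketch of what the quoted patch does on resume: pins
the kernel does not use get their interrupt-enable, interrupt-mask and
wake bits cleared. This is not the driver code. The bit positions are
illustrative placeholders, a plain array stands in for the MMIO
registers, and struct fake_gpio, fake_irq_init_pin() and fake_resume()
are hypothetical names made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit positions only; the real values come from the driver. */
#define BIT(n)               (UINT32_C(1) << (n))
#define INTERRUPT_ENABLE_OFF 11
#define INTERRUPT_MASK_OFF   12
#define WAKE_CNTRL_OFF_S0I3  13
#define WAKE_CNTRL_OFF_S3    14
#define WAKE_CNTRL_OFF_S4    15

/* Hypothetical stand-in for the driver state: an array instead of MMIO. */
struct fake_gpio {
	uint32_t *regs;
	size_t npins;
};

/*
 * Clear the interrupt and wake bits of a single pin and leave every
 * other bit of its register untouched.
 */
static void fake_irq_init_pin(struct fake_gpio *g, size_t pin)
{
	const uint32_t mask = BIT(WAKE_CNTRL_OFF_S0I3) | BIT(WAKE_CNTRL_OFF_S3) |
			      BIT(INTERRUPT_MASK_OFF) | BIT(INTERRUPT_ENABLE_OFF) |
			      BIT(WAKE_CNTRL_OFF_S4);

	if (pin >= g->npins)
		return;

	g->regs[pin] &= ~mask;
}

/*
 * Resume path as changed by the patch: pins not in use by the kernel are
 * re-initialized; pins in use would get their saved state restored
 * (omitted here).
 */
static void fake_resume(struct fake_gpio *g, const int *in_use)
{
	for (size_t pin = 0; pin < g->npins; pin++) {
		if (!in_use[pin])
			fake_irq_init_pin(g, pin);
	}
}
```

Note that in this sketch the wake-control bits are cleared together
with the interrupt bits for every pin that is not flagged as in use.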
>> Hello Kornel,
>>
>> I've found that this commit which was included in 6.3-rc5 is causing a
>> regression waking up from lid on a Lenovo Z13.
> I observed "unable to wake from power button" on an AMD-based Dell
> platform.

This sounds like something that we want to fix quickly.

> Reverting "pinctrl: amd: Disable and mask interrupts on resume" on
> top of 6.3-rc6 does fix the issue.
>>
>> Reverting it on top of 6.3-rc6 resolves the problem.
>>
>> I've collected what I can into this bug report:
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=217315
>>
>> Linus Walleij,
>>
>> It looks like this was CC'd to stable.  If we can't get a quick
>> solution we might want to pull this from stable.
>
> This commit landed in 6.1.23 as well:
>
>         d9c63daa576b2 pinctrl: amd: Disable and mask interrupts on resume

As far as I can see, it was backported all the way back to 5.10.y.

The culprit has no Fixes: tag, which makes me wonder: should we quickly
(e.g. today) revert this in mainline to get back to the previous state,
so that Greg can pick up the revert for the next stable releases he
is apparently preparing right now?

Greg, is there another way to get this fixed quickly in the stable
trees? One obvious option would be "revert this now in stable, reapply
it later together with a fix". But I'm under the impression that this
is too much of a hassle and thus something you only do in dire situations?

I'm asking because over time I've noticed that many regressions end up
in a similar situation, and a fair number of them take a long time to
get fixed even when a developer has provided a fix, because reviewing
and mainlining the fix takes a week or two (sometimes more). And that
situation is increasingly hitting a nerve here. :-/

Ciao, Thorsten