Re: [PATCH] USB: Fix the issue of task recovery failure caused by USB status when S4 wakes up

From: duanchenghao
Date: Thu Oct 10 2024 - 01:59:41 EST


Hi Alan,

在 2024-10-09星期三的 11:50 -0400,Alan Stern写道:
> > On Wed, Oct 09, 2024 at 10:35:05AM +0800, duanchenghao wrote:
> > > > Hi Alan,
> > > >
> > > > These are two patches, each addressing the same issue. The
> > > > current
> > > > patch is a direct solution to the problem of the interrupt
> > > > bottom
> > > > half
> > > > being frozen. The patch you replied with is another,
> > > > alternative
> > > > solution to the same problem. Please review which solution is
> > > > more
> > > > suitable, or if there are any other revised proposals.
> > > >
> > > >
> > > > Please review the patch I mentioned:
> > > > https://lore.kernel.org/all/0a4dc46ae767c28dd207ae29511ede747f05539a.camel@xxxxxxxxxx/
> >
> > I still think your whole approach is wrong.  There is no need to
> > change
> > the way the HCD_FLAG_WAKEUP_PENDING flag gets set or cleared;
> > that's
> > not
> > the reason for the problem you're seeing.
> >

Thank you very much for your evaluation of the scheme. I have a
question regarding why the set_bit operation for the
HCD_FLAG_WAKEUP_PENDING flag is performed in the top half of an
interrupt handler, while the clear_bit operation is done in the bottom
half. This seems to contradict conventional practices. The issue is not
limited to S4; if other processes freeze the work queue in the bottom
half, the same problem may arise.

The solution you described below should be able to resolve the current
S4 issue, but for now, we haven't identified any other scenarios that
require the same operation. Based on my understanding, the USB device
is woken up in the bottom half of the interrupt, and both the set_bit
and clear_bit operations for the HCD_FLAG_WAKEUP_PENDING flag should be
executed within the same thread in the bottom half. May I ask if there
are any other reasons why the set_bit is executed in the top half?

Thanks
Duanchenghao

> > The problem occurs because when suspend_common() in
> > drivers/usb/core/hcd-pci.c sets do_wakeup, it does so by calling
> > device_may_wakeup(), and the value that function returns is not
> > what
> > we
> > want.  The value is based on whether the controller's power/wakeup
> > attribute is set, but we also have to take into account the type of
> > suspend that's happening.
> >
> > Basically, if msg is one of the suspend types for which wakeups
> > should
> > always be disabled, then do_wakeup should be set to false.  There
> > isn't
> > a macro to test for these things, but there should be.  Something
> > like
> > PMSG_IS_AUTO() in include/linux/pm.h; maybe call it
> > PMSG_NO_WAKEUP(). 
> > This macro should return true if the PM_EVENT_FREEZE or
> > PM_EVENT_QUIESCE
> > bits are set in msg.event.
> >
> > Rafael, please correct me if I got any of this wrong.
> >
> > So the right way to fix the problem is to add to pm.h:
> >
> > #define PMSG_NO_WAKEUP(msg)     (((msg.event) & \
> >                 (PM_EVENT_FREEZE | PM_EVENT_QUIESCE)) != 0)
> >
> > and in suspend_common():
> >
> >         if (PMSG_IS_AUTO(msg))
> >                 do_wakeup = true;
> >         else if (PMSG_NO_WAKEUP(msg))
> >                 do_wakeup = false;
> >         else
> >                 do_wakeup = device_may_wakeup(dev);
> >
> > Then with do_wakeup set to false, none of the HCD_WAKEUP_PENDING()
> > tests
> > in the following code will be executed, so the routine won't fail
> > during
> > the freeze or restore phase with -EBUSY.
> >
> > You'll also have to define an hcd_pci_freeze() routine, just
> > like hcd_pci_suspend() except that it uses PMSG_FREEZE instead of
> > PMSG_SUSPEND.  And the .freeze callback in usb_hcd_pci_pm_ops
> > should
> > be
> > changed to hcd_pci_freeze.
> >
> > In fact, it looks like choose_wakeup() in drivers/usb/core/driver.c
> > could also use the new macro, instead of checking for FREEZE or
> > QUIESCE
> > directly the way it does now.
> >
> > Alan Stern