Re: [PATCH v3 1/1] x86/platform/intel-mid: Retrofit pci_platform_pm_ops ->get_state hook

From: Andy Shevchenko
Date: Mon Oct 24 2016 - 05:19:51 EST


On Sun, 2016-10-23 at 16:57 +0200, Lukas Wunner wrote:
> On Sun, Oct 23, 2016 at 01:37:55PM +0100, Bryan O'Donoghue wrote:
> > On Sun, 2016-10-23 at 13:55 +0200, Lukas Wunner wrote:
> > > Commit cc7cc02bada8 ("PCI: Query platform firmware for device power
> > > state") augmented struct pci_platform_pm_ops with a ->get_state hook
> > > and implemented it for acpi_pci_platform_pm, the only
> > > pci_platform_pm_ops existing till v4.7.
> > >
> > > However v4.8 introduced another pci_platform_pm_ops for Intel Mobile
> > > Internet Devices with commit 5823d0893ec2 ("x86/platform/intel-mid:
> > > Add Power Management Unit driver").  It is missing the ->get_state
> > > hook, which is fatal since pci_set_platform_pm() enforces its
> > > presence.  Andy Shevchenko reports that without the present commit,
> > > such a device "crashes without even a character printed out on serial
> > > console and reboots (since watchdog)".
> > >
> > > Retrofit mid_pci_platform_pm with the missing callback to fix the
> > > breakage.
> > >
> > > Fixes: cc7cc02bada8 ("PCI: Query platform firmware for device power state")
> > > Cc: x86@xxxxxxxxxx
> > > Signed-off-by: Lukas Wunner <lukas@xxxxxxxxx>
> > > Acked-and-tested-by: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxl.com>
> > > Acked-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> > > ---
> > > Changes v1 -> v2:
> > > - Cast return value of intel_mid_pci_get_power_state() to
> > >   (__force pci_power_t) to avoid "sparse -D__CHECK_ENDIAN__" warning.
> > > - Add ack by Andy Shevchenko.
> > >
> > > Changes v2 -> v3:
> > > - Amend commit message to explain the user-visible failure mode as
> > >   reported by Andy.
> > > - Add ack by Bjorn Helgaas and Fixes tag.
> > >
> > >  arch/x86/include/asm/intel-mid.h  |  1 +
> > >  arch/x86/platform/intel-mid/pwr.c | 19 +++++++++++++++++++
> > >  drivers/pci/pci-mid.c             |  6 ++++++
> > >  3 files changed, 26 insertions(+)
> > >
> > > diff --git a/arch/x86/include/asm/intel-mid.h b/arch/x86/include/asm/intel-mid.h
> > > index 5b6753d..49da9f4 100644
> > > --- a/arch/x86/include/asm/intel-mid.h
> > > +++ b/arch/x86/include/asm/intel-mid.h
> > > @@ -17,6 +17,7 @@
> > >  
> > >  extern int intel_mid_pci_init(void);
> > >  extern int intel_mid_pci_set_power_state(struct pci_dev *pdev, pci_power_t state);
> > > +extern pci_power_t intel_mid_pci_get_power_state(struct pci_dev *pdev);
> > >  
> > >  extern void intel_mid_pwr_power_off(void);
> > >  
> > > diff --git a/arch/x86/platform/intel-mid/pwr.c b/arch/x86/platform/intel-mid/pwr.c
> > > index 5d3b45a..67375dd 100644
> > > --- a/arch/x86/platform/intel-mid/pwr.c
> > > +++ b/arch/x86/platform/intel-mid/pwr.c
> > > @@ -272,6 +272,25 @@ int intel_mid_pci_set_power_state(struct pci_dev *pdev, pci_power_t state)
> > >  }
> > >  EXPORT_SYMBOL_GPL(intel_mid_pci_set_power_state);
> > >  
> > > +pci_power_t intel_mid_pci_get_power_state(struct pci_dev *pdev)
> > > +{
> > > +	struct mid_pwr *pwr = midpwr;
> > > +	int id, reg, bit;
> > > +	u32 power;
> > > +
> > > +	if (!pwr || !pwr->available)
> > > +		return PCI_UNKNOWN;
> > > +
> > > +	id = intel_mid_pwr_get_lss_id(pdev);
> > > +	if (id < 0)
> > > +		return PCI_UNKNOWN;
> > > +
> > > +	reg = (id * LSS_PWS_BITS) / 32;
> > > +	bit = (id * LSS_PWS_BITS) % 32;
> > > +	power = mid_pwr_get_state(pwr, reg);
> > > +	return (__force pci_power_t)((power >> bit) & 3);
> > > +}
> > > +
> > >  void intel_mid_pwr_power_off(void)
> > >  {
> > >  	struct mid_pwr *pwr = midpwr;
> > > diff --git a/drivers/pci/pci-mid.c b/drivers/pci/pci-mid.c
> > > index 55f453d..c7f3408 100644
> > > --- a/drivers/pci/pci-mid.c
> > > +++ b/drivers/pci/pci-mid.c
> > > @@ -29,6 +29,11 @@ static int mid_pci_set_power_state(struct pci_dev *pdev, pci_power_t state)
> > >  	return intel_mid_pci_set_power_state(pdev, state);
> > >  }
> > >  
> > > +static pci_power_t mid_pci_get_power_state(struct pci_dev *pdev)
> > > +{
> > > +	return intel_mid_pci_get_power_state(pdev);
> > > +}
> > > +
> > >  static pci_power_t mid_pci_choose_state(struct pci_dev *pdev)
> > >  {
> > >  	return PCI_D3hot;
> > > @@ -52,6 +57,7 @@ static bool mid_pci_need_resume(struct pci_dev *dev)
> > >  static struct pci_platform_pm_ops mid_pci_platform_pm = {
> > >  	.is_manageable = mid_pci_power_manageable,
> > >  	.set_state = mid_pci_set_power_state,
> > > +	.get_state = mid_pci_get_power_state,
> > >  	.choose_state = mid_pci_choose_state,
> > >  	.sleep_wake = mid_pci_sleep_wake,
> > >  	.run_wake = mid_pci_run_wake,
> >
> > Shouldn't this serialize like this
> >
> > 	might_sleep();
> >
> > 	reg = (id * LSS_PWS_BITS) / 32;
> > 	bit = (id * LSS_PWS_BITS) % 32;
> >
> > 	mutex_lock(&pwr->lock);
> > 	power = mid_pwr_get_state(pwr, reg);
> > 	mutex_unlock(&pwr->lock);
> >
> > 	return (__force pci_power_t)((power >> bit) & 3);
> >
> > there's a corresponding flow in mid_pwr_set_power_state() that operates
> > in exactly that way.
>
> mid_pwr_set_power_state() uses a series of steps (set the power state,
> wait for completion) so presumably Andy thought this needs to be done
> under a lock to prevent concurrent execution.
>
> mid_pwr_get_state() on the other hand is just a register read, which
> I assume is atomic.  The other stuff (calling intel_mid_pwr_get_lss_id(),
> calculation of reg and bit) seems to be static, it never changes across
> invocations.  Hence there doesn't seem to be a necessity to acquire
> the mutex and call might_sleep().
>
> That said I'm not really familiar with these devices and rely on Andy's
> ack for correctness.  Andy if I'm mistaken please shout, otherwise I
> assume the patch is correct.

readl() is indeed atomic; the question is the ordering of reads and
writes. On this platform it's just an interface to the PWRMU, which is
slow and uses two sets of registers (one for read, one for write). The
actual operation happens after the doorbell is written (with regard to
the PM_CMD bits). So there is a potential that a read will return the
earlier state of the device while the PWRMU is processing the new one,
though I believe that is prevented by the PCI core.
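
To make the window concrete, here is a rough userspace sketch, not the
driver code. Everything named model_* is hypothetical, MODEL_REGS is an
arbitrary size, and LSS_PWS_BITS is assumed to be 2 (the "& 3" mask in the
patch suggests two status bits per subsystem). It only models the idea of
separate command and status banks, with the status bank changing once the
PWRMU has finished processing a request:

```c
#include <assert.h>
#include <stdint.h>

#define LSS_PWS_BITS 2	/* assumption: two status bits per subsystem */
#define MODEL_REGS   4	/* hypothetical number of 32-bit registers */

/* Hypothetical model: one bank the driver writes, one bank it reads. */
struct model_pwrmu {
	uint32_t cmd[MODEL_REGS];	/* requested states (write side) */
	uint32_t sts[MODEL_REGS];	/* reported states (read side) */
};

/* Queue a new 2-bit state for subsystem 'id' in the command bank. */
static void model_request_state(struct model_pwrmu *p, int id, uint32_t state)
{
	int reg = (id * LSS_PWS_BITS) / 32;
	int bit = (id * LSS_PWS_BITS) % 32;

	assert(id >= 0 && reg < MODEL_REGS);
	p->cmd[reg] &= ~(UINT32_C(3) << bit);
	p->cmd[reg] |= (state & 3) << bit;
}

/* Stand-in for the PWRMU finishing its work after the doorbell write. */
static void model_complete(struct model_pwrmu *p)
{
	for (int i = 0; i < MODEL_REGS; i++)
		p->sts[i] = p->cmd[i];
}

/* Same reg/bit arithmetic as the patch, applied to the status bank. */
static uint32_t model_get_state(const struct model_pwrmu *p, int id)
{
	int reg = (id * LSS_PWS_BITS) / 32;
	int bit = (id * LSS_PWS_BITS) % 32;

	assert(id >= 0 && reg < MODEL_REGS);
	return (p->sts[reg] >> bit) & 3;
}
```

In this model a model_get_state() call issued between
model_request_state() and model_complete() still returns the old value,
which is the stale read described above. Each individual read is a single
32-bit load, so a lock around it would not close that window by itself.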

>
> The usage of a mutex in mid_pwr_set_power_state() actually seems
> questionable since this is called with interrupts disabled:
>
> pci_pm_resume_noirq
>   pci_pm_default_resume_early
>     pci_power_up
>       platform_pci_set_power_state
>         mid_pci_set_power_state
>           intel_mid_pci_set_power_state
>             mid_pwr_set_power_state

Hmm... I have to look at this closer. I don't remember why I put mutex
in the first place there. Anyway it's another story.

--
Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>
Intel Finland Oy