Re: [PATCH v3 1/1] x86/platform/intel-mid: Retrofit pci_platform_pm_ops ->get_state hook

From: Andy Shevchenko
Date: Mon Oct 24 2016 - 07:05:55 EST


On Mon, 2016-10-24 at 12:09 +0200, Lukas Wunner wrote:
> On Mon, Oct 24, 2016 at 12:15:05PM +0300, Andy Shevchenko wrote:
> > On Sun, 2016-10-23 at 16:57 +0200, Lukas Wunner wrote:
> > > On Sun, Oct 23, 2016 at 01:37:55PM +0100, Bryan O'Donoghue wrote:
> > > > Shouldn't this serialize like this
> > > >
> > > >     might_sleep();
> > > >
> > > >     reg = (id * LSS_PWS_BITS) / 32;
> > > >     bit = (id * LSS_PWS_BITS) % 32;
> > > >
> > > >     mutex_lock(&pwr->lock);
> > > >     power = mid_pwr_get_state(pwr, reg);
> > > >     mutex_unlock(&pwr->lock);
> > > >
> > > >     return (__force pci_power_t)((power >> bit) & 3);
> > > >
> > > > there's a corresponding flow in mid_pwr_set_power_state() that
> > > > operates
> > > > in exactly that way.
> > >
> > > mid_pwr_set_power_state() uses a series of steps (set the power
> > > state,
> > > wait for completion) so presumably Andy thought this needs to be
> > > done
> > > under a lock to prevent concurrent execution.
> > >
> > > mid_pwr_get_state() on the other hand is just a register read,
> > > which
> > > > I assume is atomic.  The other stuff (calling
> > > intel_mid_pwr_get_lss_id(),
> > > calculation of reg and bit) seems to be static, it never changes
> > > across
> > > > invocations.  Hence there doesn't seem to be a necessity to
> > > acquire
> > > the mutex and call might_sleep().
> > >
> > > That said I'm not really familiar with these devices and rely on
> > > Andy's
> > > > ack for correctness.  Andy if I'm mistaken please shout, otherwise
> > > I
> > > assume the patch is correct.
> >
> > readl() is indeed atomic, the question is ordering of reads and
> > writes,
> > but on this platform it's just an interface to PWRMU which is slow
> > and
> > uses two sets of registers (one for read, one for write). Actual
> > operation happens after doorbell is written (with regard to PM_CMD
> > bits). So, there is a potential that read will return earlier state
> > of
> > the device while PWRMU is processing new one, though I believe it's
> > prevented by PCI core.
>
> The corresponding functions in pci-acpi.c don't perform any locking,
> and AFAICS neither do the functions they call in drivers/acpi/.
>
> The power state is read and written from the various pci_pm_*
> callbacks
> and the PM core never executes those in parallel.
>
> However there's pci_set_power_state(), this is exported and called by
> various drivers, theoretically they would be able to execute that
> concurrently to a pci_pm_* callback, it would be silly though.
>
> Long story short, there's no locking needed unless you intend to call
> intel_mid_pci_set_power_state() from other places.  I guess that's
> what
> Bryan was alluding to when he wrote that the mutex might be "put in
> place to future-proof the code".  I note that you're exporting
> intel_mid_pci_set_power_state() even though there's currently no
> module
> user, so perhaps you're intending to call the function from somewhere
> else.

The export is there purely because the abstract stuff lives under
drivers/pci while the platform code is kept under arch/x86/platform. Other
than that there are no plans to call this outside of pci-mid.c.

>
>
> > >
> > > The usage of a mutex in mid_pwr_set_power_state() actually seems
> > > questionable since this is called with interrupts disabled:
> > >
> > > pci_pm_resume_noirq
> > > Â pci_pm_default_resume_early
> > > ÂÂÂÂpci_power_up
> > > ÂÂÂÂÂÂplatform_pci_set_power_state
> > > ÂÂÂÂÂÂÂÂmid_pci_set_power_state
> > > ÂÂÂÂÂÂÂÂÂÂintel_mid_pci_set_power_state
> > > ÂÂÂÂÂÂÂÂÂÂÂÂmid_pwr_set_power_state
> >
> > Hmm... I have to look at this closer. I don't remember why I put
> > mutex
> > in the first place there. Anyway it's another story.

There are two code paths:
  pci_power_up()
  pci_platform_power_transition()

The second one can be called in non-atomic context for sure (consider the
standard ->resume() callback).

The first one runs with IRQs disabled on the CPU side.

In any case we probably need to serialize access in our code to protect
against several PCI devices being powered up simultaneously.
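
To make the serialization concern concrete, here is a rough user-space
sketch (not the driver code) of how 2-bit power states packed into 32-bit
words are read and updated. LSS_PWS_BITS = 2 is inferred from the "& 3"
mask in Bryan's snippet; the sss[] array, NUM_REGS and both helper names
are made up for illustration.

```c
#include <stdint.h>

#define LSS_PWS_BITS	2	/* assumed width of one power state field */
#define LSS_PWS_MASK	3u
#define NUM_REGS	4	/* hypothetical number of 32-bit status words */

/* Stand-in for the status registers; the real ones are MMIO. */
static uint32_t sss[NUM_REGS];

/* Extract the 2-bit state of subsystem 'id' (caller keeps id < 64). */
static unsigned int lss_get_state(unsigned int id)
{
	unsigned int reg = (id * LSS_PWS_BITS) / 32;
	unsigned int bit = (id * LSS_PWS_BITS) % 32;

	return (sss[reg] >> bit) & LSS_PWS_MASK;
}

/* Read-modify-write: the neighbours in the same word must be preserved. */
static void lss_set_state(unsigned int id, unsigned int state)
{
	unsigned int reg = (id * LSS_PWS_BITS) / 32;
	unsigned int bit = (id * LSS_PWS_BITS) % 32;
	uint32_t val = sss[reg];

	val &= ~(LSS_PWS_MASK << bit);
	val |= (uint32_t)(state & LSS_PWS_MASK) << bit;
	sss[reg] = val;
}
```

With this layout ids 16 and 17 land in the same word (reg 1, bits 0 and 2),
so two unserialized lss_set_state() calls for different devices could lose
one of the updates.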

Do you think we have to switch to a spin_lock instead, or remove the lock
completely?
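
For the sake of discussion, the spin_lock option would have roughly the
shape below. This is only a user-space model built on C11 atomics, not
kernel code: pwr_lock, pwr_word and the three helpers are invented for the
example, and in the driver the lock would presumably be a real spinlock
taken with spin_lock_irqsave().

```c
#include <stdatomic.h>
#include <stdint.h>

/* Toy busy-wait lock; stands in for a kernel spinlock in this sketch. */
static atomic_flag pwr_lock = ATOMIC_FLAG_INIT;
static uint32_t pwr_word;	/* hypothetical shared register image */

static void pwr_lock_acquire(void)
{
	/* Spin until the flag was previously clear; this never sleeps. */
	while (atomic_flag_test_and_set_explicit(&pwr_lock,
						 memory_order_acquire))
		;
}

static void pwr_lock_release(void)
{
	atomic_flag_clear_explicit(&pwr_lock, memory_order_release);
}

/* Update the 2-bit field at 'bit' with the lock held around the RMW. */
static uint32_t pwr_update(unsigned int bit, unsigned int state)
{
	uint32_t val;

	pwr_lock_acquire();
	val = (pwr_word & ~(3u << bit)) | ((uint32_t)(state & 3u) << bit);
	pwr_word = val;
	pwr_lock_release();

	return val;
}
```

Unlike a mutex, a lock of this kind never sleeps, so the same helper could
be used from both code paths above.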

--
Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>
Intel Finland Oy