Re: [PATCH] PCI/portdrv: Disallow runtime suspend when waekup is required but PME service isn't supported

From: Kai-Heng Feng
Date: Tue Aug 10 2021 - 11:37:43 EST


On Mon, Aug 9, 2021 at 11:00 PM Lukas Wunner <lukas@xxxxxxxxx> wrote:
>
> [cc += Rafael]
>
> On Mon, Aug 09, 2021 at 06:40:41PM +0800, Kai-Heng Feng wrote:
> > On Mon, Aug 9, 2021 at 5:47 PM Lukas Wunner <lukas@xxxxxxxxx> wrote:
> > > On Mon, Aug 09, 2021 at 12:24:12PM +0800, Kai-Heng Feng wrote:
> > > > Some platforms cannot detect ethernet hotplug once its upstream port is
> > > > runtime suspended because PME isn't enabled in _OSC.
> > >
> > > If PME is not handled natively, why does the NIC runtime suspend?
> > > Shouldn't this be fixed in the NIC driver by keeping the device
> > > runtime active if PME cannot be used?
> >
> > That means we need to fix every user of pci_dev_run_wake(), or fix the
> > issue in pci_dev_run_wake() helper itself.
>
> If PME is not granted to the OS, the only consequence is that the PME
> port service is not instantiated at the root port. But PME is still
> enabled for downstream devices. Maybe that's a mistake? I think the
> ACPI spec is a little unclear what to do if PME control is *not* granted.
> It only specifies what to do if PME control is *granted*:

So do you prefer to just disable runtime PM for the downstream device?

>
> "If the OS successfully receives control of this feature, it must
> handle power management events as described in the PCI Express Base
> Specification."
>
> "If firmware allows the OS control of this feature, then in the context
> of the _OSC method it must ensure that all PMEs are routed to root port
> interrupts as described in the PCI Express Base Specification.
> Additionally, after control is transferred to the OS, firmware must not
> update the PME Status field in the Root Status register or the PME
> Interrupt Enable field in the Root Control register. If control of this
> feature was requested and denied or was not requested, firmware returns
> this bit set to 0."
>
> Perhaps something like the below is appropriate, I'm not sure.
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 091b4a4..7e64185 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3099,7 +3099,7 @@ void pci_pm_init(struct pci_dev *dev)
> }
>
> pmc &= PCI_PM_CAP_PME_MASK;
> - if (pmc) {
> + if (pmc && pci_find_host_bridge(dev->bus)->native_pme) {
> pci_info(dev, "PME# supported from%s%s%s%s%s\n",
> (pmc & PCI_PM_CAP_PME_D0) ? " D0" : "",
> (pmc & PCI_PM_CAP_PME_D1) ? " D1" : "",
>
>

I think this will also prevent non-root port devices from using PME.

[snipped]

> >
> > I think PME IRQ and D3cold are different things here.
> > The root port of the affected NIC doesn't support D3cold because
> > there's no power resource.
>
> If a bridge is runtime suspended to D3, the hierarchy below it is
> inaccessible, which is basically the same as if it's put in D3cold,
> hence the name pci_dev_check_d3cold(). That function allows a device
> to block an upstream bridge from runtime suspending because the device
> is not allowed to go to D3cold. The function specifically checks whether
> a device is PME-capable from D3cold. The NIC claims it's capable but
> the PME event has no effect because PME control wasn't granted to the
> OS and firmware neglected to set PME Interrupt Enable in the Root Control
> register. We could check for this case and block runtime PM of bridges
> based on the rationale that PME polling is needed to detect wakeup.

So for this case, should we prevent the downstream devices from
runtime suspending, or let it suspend but keep the root port active in
order to make pci_pme_list_scan() work?

Kai-Heng

>
> Thanks,
>
> Lukas