Re: [PATCH v2] PCI/ACPI: Don't assume D3 support if a device is power manageable

From: Mario Limonciello
Date: Mon Oct 24 2022 - 11:41:18 EST


On 10/22/22 12:20, Bjorn Helgaas wrote:
[+cc Lukas, in case you have comment on acpi_pci_power_manageable()]

There's a little bit of cognitive dissonance between the subject and
the comment line:

PCI/ACPI: Don't assume D3 support if a device is power manageable
+ /* Assume D3 support if the bridge is power-manageable by ACPI. */

I suggest tweaking the subject to mention the actual issue here. It
looks like it might be something to do with _S0W?


Ah, I changed the patch from when I first wrote the commit message.

When I resubmit I'll change the subject should to:

"PCI/ACPI: Validate devices with power resources support D3"

On Thu, Oct 20, 2022 at 03:11:11PM -0500, Mario Limonciello wrote:
On some firmware configurations where D3 is not supported for
"AMD Pink Sardine" the PCIe root port for tunneling will still be
opted into runtime PM since `acpi_pci_bridge_d3` returns true.

This paragraph sounds like it describes where you found the problem,
but I don't think it helps us understand what the problem *is* or how
to make sure the patch will work on other systems.

This later causes the device link between the USB4 router and the
PCIe root port for tunneling to misbehave. The USB4 router may
enter D3 via runtime PM, but the PCIe root port for tunneling
remains in D0. The expectation is the USB4 router should also
remain in D0 since the PCIe root port for tunneling remained in D0.

I'm not very familiar with device links. How does the link misbehave?
Is the link doing something wrong, or is it just that we're putting
one of the devices in the wrong power state?

Depending upon how the link is expressed will determine it's behavior. In this case the device link is between the USB4 router and the PCIe root port that is used for tunneling where the the PCIe root port for tunneling is a consumer.

When all the consumers enter runtime PM the USB4 router will also enter runtime PM.


I assume the USB4 router would be a descendant of the Root Port.
Generally descendants can be in lower-power states than their parents.
What expresses the constraint that the router must stay in D0 because
its parent is in D0?


Actually the topology that the root port used for tunneling and the root port for the USB4 router are siblings not parent/child.

The code that creates the link between the two lives in drivers/thunderbolt/acpi.c

Here is a small diagram showing it from a system with a dock connected:

├─ 0000:00:03.1
| | pcieport
| ├─ D0
| └─ 0000:03:00.0
| | pcieport
| ├─ D0
| ├─ 0000:04:02.0
| | | pcieport
| | ├─ D0
| | └─ 0000:05:00.0
| | | xhci_hcd
| | └─ D0
| └─ 0000:04:04.0
| | pcieport
| └─ D3hot
├─ 0000:00:04.1
| | pcieport
| └─ D3cold
├─ 0000:00:08.3
| | pcieport
| ├─ D0
| ├─ 0000:64:00.0
| | | xhci_hcd
| | └─ D0
| ├─ 0000:64:00.3
| | | xhci_hcd
| | └─ D3cold
| ├─ 0000:64:00.4
| | | xhci_hcd
| | └─ D3cold
| ├─ 0000:64:00.5
| | | thunderbolt
| | └─ D0
| └─ 0000:64:00.6
| | thunderbolt
| └─ D3cold

0000:00:04.1 and 0000:00:03.1 are PCIe root ports used for tunneling for USB4 routers at 0000:64:00.6 and 0000:64:00.5.

`acpi_pci_bridge_d3` has a number of checks, but starts out with an
assumption that if a device is power manageable introduced from
commit c6e331312ebf ("PCI/ACPI: Whitelist hotplug ports for D3 if power
managed by ACPI") that it will support D3. This is not a valid
assumption, as the PCIe root port for tunneling has power resources
but does not support D3hot or D3cold.

It looks like acpi_pci_power_manageable(dev) means "the device has
_PS0 or _PR0". Currently we assume that means we can put dev in D3.

And I think you're saying that assumption is a little bit too
aggressive because if _S0W says the device can't wake from D3hot or
D3cold, we should *not* use D3?

Exactly, it's too aggressive. This exact firmware configuration when brought to Windows will prohibit the PCIe root port for tunneling going into D3.


Instead of making this assertion from the power resources check
immediately, move the check to later on, which will have validated
that D3hot or D3cold can actually be reached.

IIUC the intervening code doesn't check whether D3hot/D3cold can be
*reached*, but whether the device can *wake* from D3hot/D3cold.

Right; but acpi_pci_bridge_d3 is used to decide whether the device is opted into runtime PM.


This fixes the USB4 router going into D3 when the firmware says that
the PCIe root port for tunneling can't handle it.

For maintenance purposes, I think it will be helpful to know
specifically which devices are involved (e.g., the PCI bus/device/fns
would show the PCI relationship) and how the firmware says the Root
Port can't handle D3. I assume this would be _S0W?

Yes, it's _S0W, which is why I have this marked as:

Fixes: dff6139015dc6

It's very similar to that, but in this case power resources are also declared.


Maybe even a pidgin example of the ACPI pieces involved here, e.g.,

RP01._PR0
RP01._S0W (0x0) # in S0, can wake from D0 only

Yeah; I think I can amend my diagram above with some of this information.


Fixes: dff6139015dc6 ("PCI/ACPI: Allow D3 only if Root Port can signal and wake from D3")
Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx>
---
v1->v2:
* Just return value of acpi_pci_power_manageable
* Remove extra word in commit message
---
drivers/pci/pci-acpi.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index a46fec776ad77..8c6aec50dd471 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -984,10 +984,6 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
if (acpi_pci_disabled || !dev->is_hotplug_bridge)
return false;
- /* Assume D3 support if the bridge is power-manageable by ACPI. */
- if (acpi_pci_power_manageable(dev))
- return true;
-
rpdev = pcie_find_root_port(dev);
if (!rpdev)
return false;
@@ -1023,7 +1019,8 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
obj->integer.value == 1)
return true;
- return false;
+ /* Assume D3 support if the bridge is power-manageable by ACPI. */
+ return acpi_pci_power_manageable(dev);
}
int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
--
2.34.1