Re: [PATCH v5 2/5] PCI: Check PCI_PM_CTRL instead of PCI_COMMAND in pci_dev_wait()

From: Mario Limonciello
Date: Wed Sep 04 2024 - 11:24:55 EST

Next message: Yury Norov: "Re: [RFC PATCH 2/2] sched: Improve cache locality of RSEQ concurrency IDs for intermittent workloads"
Previous message: Uros Bizjak: "Re: [PATCH v2] LoongArch/percpu: Simplify _percpu_read() and _percpu_write()"
In reply to: Mika Westerberg: "Re: [PATCH v5 2/5] PCI: Check PCI_PM_CTRL instead of PCI_COMMAND in pci_dev_wait()"
Next in thread: Mika Westerberg: "Re: [PATCH v5 2/5] PCI: Check PCI_PM_CTRL instead of PCI_COMMAND in pci_dev_wait()"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 9/4/2024 07:05, Mika Westerberg wrote:

Hi,

On Tue, Sep 03, 2024 at 01:32:30PM -0500, Mario Limonciello wrote:

On 9/3/2024 13:25, Bjorn Helgaas wrote:

On Tue, Sep 03, 2024 at 12:31:00PM -0500, Mario Limonciello wrote:

On 9/3/2024 12:11, Bjorn Helgaas wrote:
...

8) The USB4 stack sees the device and assumes it is in D0, but it
seems to still be in D3cold. What is this based on? Is there a
config read that returns ~0 data when it shouldn't?

Yes there is. From earlier in the thread I have a [log] I shared.

The message emitted is from ring_interrupt_active():

"thunderbolt 0000:e5:00.5: interrupt for TX ring 0 is already enabled"

Right, that's in the cover letter, but I can't tell from this what the
ioread32(ring->nhi->iobase + reg) returned. It looks like this is an
MMIO read of BAR 0, not a config read.

Yeah. I suppose another way to approach this problem is to make something
else in the call chain poll PCI_PM_CTRL.

Polling at the start of nhi_runtime_resume() should also work. For the
"normal" scenario it would just be a single read to PCI_PM_CTRL.

Mika, thoughts?

We did this experiment to throw code to poll PCI_PM_CTRL at the start of nhi_runtime_resume() but this also fails. From that I would hypothesize the device transitioned to D0uninitialized sometime in the middle of pci_pm_runtime_resume() before the call to pm->runtime_resume(dev);

I'm starting to wonder if we are looking at the correct place ;-) This
reminds me that our PCIe SV people recently reported a couple of Linux
related issues which they recommended to fix, and these are on my list
but I'll share them because maybe they are related?

Thanks for sharing those. We had a try with them but sorry to say no improvements to the issue at hand.

First problem, and actually a PCI spec violation, is that Linux does not
clear Bus Master, MMIO and IO space enables when it programs the device
to D3 on runtime suspend path. It does so on system sleep path though.
Something like below (untested) should do that:

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index f412ef73a6e4..79a566376301 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1332,6 +1332,7 @@ static int pci_pm_runtime_suspend(struct device *dev)
if (!pci_dev->state_saved) {
pci_save_state(pci_dev);
+ pci_pm_default_suspend(pci_dev);
pci_finish_runtime_suspend(pci_dev);
}

The second thing is that Thunderbolt driver, for historical reasons,
leaves the MSI enabled when entering D3. This too might be related. I
think we can unconditionally disable it so below hack should do that
(untested as well). I wonder if you could try if any of these or both
can help here? Both of these issues can result unwanted events during D3
entry as far as I understand.

diff --git a/drivers/thunderbolt/ctl.c b/drivers/thunderbolt/ctl.c
index dc1f456736dc..73b815fbbceb 100644
--- a/drivers/thunderbolt/ctl.c
+++ b/drivers/thunderbolt/ctl.c
@@ -659,12 +659,11 @@ struct tb_ctl *tb_ctl_alloc(struct tb_nhi *nhi, int index, int timeout_msec,
if (!ctl->frame_pool)
goto err;
- ctl->tx = tb_ring_alloc_tx(nhi, 0, 10, RING_FLAG_NO_SUSPEND);
+ ctl->tx = tb_ring_alloc_tx(nhi, 0, 10, 0);
if (!ctl->tx)
goto err;
- ctl->rx = tb_ring_alloc_rx(nhi, 0, 10, RING_FLAG_NO_SUSPEND, 0, 0xffff,
- 0xffff, NULL, NULL);
+ ctl->rx = tb_ring_alloc_rx(nhi, 0, 10, 0, 0, 0xffff, 0xffff, NULL, NULL);
if (!ctl->rx)
goto err;

Next message: Yury Norov: "Re: [RFC PATCH 2/2] sched: Improve cache locality of RSEQ concurrency IDs for intermittent workloads"
Previous message: Uros Bizjak: "Re: [PATCH v2] LoongArch/percpu: Simplify _percpu_read() and _percpu_write()"
In reply to: Mika Westerberg: "Re: [PATCH v5 2/5] PCI: Check PCI_PM_CTRL instead of PCI_COMMAND in pci_dev_wait()"
Next in thread: Mika Westerberg: "Re: [PATCH v5 2/5] PCI: Check PCI_PM_CTRL instead of PCI_COMMAND in pci_dev_wait()"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]