Re: [PATCH v5 2/5] PCI: Check PCI_PM_CTRL instead of PCI_COMMAND in pci_dev_wait()

From: Mika Westerberg
Date: Tue Sep 10 2024 - 05:13:44 EST


Hi,

On Mon, Sep 09, 2024 at 03:40:54PM -0500, Mario Limonciello wrote:
> > Few additional side paths here, though. This is supposed to work so that
> > once the host router sleep bit is set the driver is supposed to allow
> > the domain to enter sleep (e.g it should not be waken up before it is
> > fully transitioned). That's what we do:
> >
> > 1. All tunneled PCIe Root/Downstream ports are in D3.
> > 2. All tunneled USB 3.x ports are in U3.
> > 3. No DisplayPort is tunneled.
> > 4. Thunderbolt driver enables wakes.
> > 5. Thunderbolt driver writes sleep ready bit of the host router.
> > 6. Thunderbolt driver runtime suspend is complete.
> > 7. ACPI method is called (_PS3 or _PR3.OFF) that will trigger the "Sleep
> > Event".
> >
> > If between 5 and 7 there is device connected, it should not "abort" the
> > sequence. Unfortunately this is not explict in the USB4 spec but the
> > connection manager guide has similar note. Even if the connect happens
> > there the "Sleep Event" should happen but after that it can trigger
> > normal wakeup which will then bring everything back.
> >
> > Would it be possible to enable tracing around these steps so that we
> > could see if there is hotplug notification somewhere there that is not
> > expected? Here are instructions how to get pretty accurate trace:
> >
> > https://github.com/intel/tbtools?tab=readme-ov-file#tracing
> >
> > Please also take full dmesg.
>
> Sure, here is the dmesg with tracing enabled:
>
> https://gist.github.com/superm1/5186e0023c8a5d2ecd75c50fd2168308

Thanks! I meant also enable control channel tracing as explained in the
above linked page so we could maybe see the hotplug packet coming from
the lane adapter too. Anyhow,

[ 560.789681] thunderbolt 0-2: device disconnected
[ 560.789771] thunderbolt 0000:e5:00.5: bandwidth consumption changed, re-calculating estimated bandwidth
[ 560.789775] thunderbolt 0000:e5:00.5: bandwidth re-calculation done
[ 560.789778] thunderbolt 0000:e5:00.5: looking for DP IN <-> DP OUT pairs:
[ 560.789800] thunderbolt 0000:e5:00.5: 0:6: DP IN available
[ 560.789803] thunderbolt 0000:e5:00.5: 0:6: no suitable DP OUT adapter available, not tunneling
[ 560.790484] thunderbolt 0000:e5:00.5: 0:7: DP IN available
[ 560.790487] thunderbolt 0000:e5:00.5: 0:7: no suitable DP OUT adapter available, not tunneling
...
[ 578.677640] thunderbolt 0000:e5:00.5: nhi_runtime_suspend() - enter, pdev->current_state = 0
[ 578.677648] thunderbolt 0000:e5:00.5: 0: suspending switch
[ 578.677653] thunderbolt 0000:e5:00.5: 0: enabling wakeup: 0x3f
[ 578.678248] thunderbolt 0000:e5:00.5: stopping RX ring 0
[ 578.678256] thunderbolt 0000:e5:00.5: disabling interrupt at register 0x38200 bit 3 (0x9 -> 0x1)
[ 578.678272] thunderbolt 0000:e5:00.5: stopping TX ring 0
[ 578.678277] thunderbolt 0000:e5:00.5: disabling interrupt at register 0x38200 bit 0 (0x1 -> 0x0)
[ 578.678287] thunderbolt 0000:e5:00.5: control channel stopped
...
pci_pm_runtime_resume()
pci_pm_default_resume_early(pci_dev);
pci_pm_power_up_and_verify_state(pci_dev);
pci_power_up(pci_dev);
platform_pci_set_power_state(dev, PCI_D0);
// here we should see acpi_pci_set_power_state() printing "xxx: power
// state changed by ACPI to D0" but we don't.
//
// also pdev->current_state is set to PCI_D0
pci_update_current_state(pci_dev, PCI_D0);
// this one reads it back, it should be 0..
pm->runtime_resume(dev);
nhi_runtime_resume(dev)
[ 578.773503] thunderbolt 0000:e5:00.5: nhi_runtime_resume() - enter, pdev->current_state = 3

// .. but it is not. It seems to be powered off, D3cold.

[ 578.773516] thunderbolt 0000:e5:00.5: control channel starting...
[ 578.773518] thunderbolt 0000:e5:00.5: starting TX ring 0
[ 578.773524] thunderbolt 0000:e5:00.5: enabling interrupt at register 0x38200 bit 0 (0xffffffff -> 0xffffffff)

// As here too.

It would be interesting to see what ACPI does here, I mean enabling
dynamic debugging of acpi_device_set_power() and friends. One suspect is
that when the runtime suspends happens, the power resource or _PS3()
gets called to put the device into D3cold and it has some sort of
Sleep() inside. This allows the OS to relese the CPU to do other things
too according to ACPI spec (and Linux does this) so now when you plug in
the device the GPE triggers wake and Linux starts processing that one.
We see the above runtime resume but then the Sleep() in the _PS3()
completes and cuts the power from the NHI while we are in the middle of
resuming it already (or something along those lines).

We saw similar "context switch" happening when PCIe L2 failed, BIOS
implemented the timeout using Sleep() that allowed Linux to process
other things which resulted unexpected wake (not the same as here
though).

So one thing to check too is how the ACPI methods get called and
especially if they somehow end up messing with each other.

> > It is entirely possible that this has nothing to do with the issue but I
> > think it is worth checking.
> >
> > The second thing we could try is to check the wake status bits after
> > this has happened, like:
> >
> > # tbdump -r 0 -a <ADAPTER> -vv -N 1 PORT_CS_18
> >
> > (where <ADAPTER> is the lane 0 adapter of the USB4 port the device was
> > connected).
> >
>
> Unfortunately the adapter is in such a bad state at this time that tbdump
> doesn't work.

Oh, indeed it is.

> > The third thing to try is to comment out TB_WAKE_ON_CONNECT in
> > tb_switch_suspend(). This should result no wake even if the device is
> > connected. This tells us that it is really the connect on USB4 port that
> > triggered the wake.
>
> Yup that's correct; there is no action on the hotplug with this change.

Okay thanks for checking.