Re: [PATCH v5 2/5] PCI: Check PCI_PM_CTRL instead of PCI_COMMAND in pci_dev_wait()
From: Mika Westerberg
Date: Fri Sep 13 2024 - 00:58:26 EST
Hi,
On Thu, Sep 12, 2024 at 11:12:11PM -0500, Mario Limonciello wrote:
> On 9/10/2024 04:13, Mika Westerberg wrote:
> > Hi,
> >
> > On Mon, Sep 09, 2024 at 03:40:54PM -0500, Mario Limonciello wrote:
> > > > Few additional side paths here, though. This is supposed to work so that
> > > > once the host router sleep bit is set the driver is supposed to allow
> > > > the domain to enter sleep (e.g it should not be waken up before it is
> > > > fully transitioned). That's what we do:
> > > >
> > > > 1. All tunneled PCIe Root/Downstream ports are in D3.
> > > > 2. All tunneled USB 3.x ports are in U3.
> > > > 3. No DisplayPort is tunneled.
> > > > 4. Thunderbolt driver enables wakes.
> > > > 5. Thunderbolt driver writes sleep ready bit of the host router.
> > > > 6. Thunderbolt driver runtime suspend is complete.
> > > > 7. ACPI method is called (_PS3 or _PR3.OFF) that will trigger the "Sleep
> > > > Event".
> > > >
> > > > If between 5 and 7 there is device connected, it should not "abort" the
> > > > sequence. Unfortunately this is not explict in the USB4 spec but the
> > > > connection manager guide has similar note. Even if the connect happens
> > > > there the "Sleep Event" should happen but after that it can trigger
> > > > normal wakeup which will then bring everything back.
> > > >
> > > > Would it be possible to enable tracing around these steps so that we
> > > > could see if there is hotplug notification somewhere there that is not
> > > > expected? Here are instructions how to get pretty accurate trace:
> > > >
> > > > https://github.com/intel/tbtools?tab=readme-ov-file#tracing
> > > >
> > > > Please also take full dmesg.
> > >
> > > Sure, here is the dmesg with tracing enabled:
> > >
> > > https://gist.github.com/superm1/5186e0023c8a5d2ecd75c50fd2168308
> >
> > Thanks! I meant also enable control channel tracing as explained in the
> > above linked page so we could maybe see the hotplug packet coming from
> > the lane adapter too. Anyhow,
> >
> > [ 560.789681] thunderbolt 0-2: device disconnected
> > [ 560.789771] thunderbolt 0000:e5:00.5: bandwidth consumption changed, re-calculating estimated bandwidth
> > [ 560.789775] thunderbolt 0000:e5:00.5: bandwidth re-calculation done
> > [ 560.789778] thunderbolt 0000:e5:00.5: looking for DP IN <-> DP OUT pairs:
> > [ 560.789800] thunderbolt 0000:e5:00.5: 0:6: DP IN available
> > [ 560.789803] thunderbolt 0000:e5:00.5: 0:6: no suitable DP OUT adapter available, not tunneling
> > [ 560.790484] thunderbolt 0000:e5:00.5: 0:7: DP IN available
> > [ 560.790487] thunderbolt 0000:e5:00.5: 0:7: no suitable DP OUT adapter available, not tunneling
> > ...
> > [ 578.677640] thunderbolt 0000:e5:00.5: nhi_runtime_suspend() - enter, pdev->current_state = 0
> > [ 578.677648] thunderbolt 0000:e5:00.5: 0: suspending switch
> > [ 578.677653] thunderbolt 0000:e5:00.5: 0: enabling wakeup: 0x3f
> > [ 578.678248] thunderbolt 0000:e5:00.5: stopping RX ring 0
> > [ 578.678256] thunderbolt 0000:e5:00.5: disabling interrupt at register 0x38200 bit 3 (0x9 -> 0x1)
> > [ 578.678272] thunderbolt 0000:e5:00.5: stopping TX ring 0
> > [ 578.678277] thunderbolt 0000:e5:00.5: disabling interrupt at register 0x38200 bit 0 (0x1 -> 0x0)
> > [ 578.678287] thunderbolt 0000:e5:00.5: control channel stopped
> > ...
> > pci_pm_runtime_resume()
> > pci_pm_default_resume_early(pci_dev);
> > pci_pm_power_up_and_verify_state(pci_dev);
> > pci_power_up(pci_dev);
> > platform_pci_set_power_state(dev, PCI_D0);
> > // here we should see acpi_pci_set_power_state() printing "xxx: power
> > // state changed by ACPI to D0" but we don't.
> > //
> > // also pdev->current_state is set to PCI_D0
> > pci_update_current_state(pci_dev, PCI_D0);
> > // this one reads it back, it should be 0..
> > pm->runtime_resume(dev);
> > nhi_runtime_resume(dev)
> > [ 578.773503] thunderbolt 0000:e5:00.5: nhi_runtime_resume() - enter, pdev->current_state = 3
> >
> > // .. but it is not. It seems to be powered off, D3cold.
> >
> > [ 578.773516] thunderbolt 0000:e5:00.5: control channel starting...
> > [ 578.773518] thunderbolt 0000:e5:00.5: starting TX ring 0
> > [ 578.773524] thunderbolt 0000:e5:00.5: enabling interrupt at register 0x38200 bit 0 (0xffffffff -> 0xffffffff)
> >
> > // As here too.
> >
> > It would be interesting to see what ACPI does here, I mean enabling
> > dynamic debugging of acpi_device_set_power() and friends.
>
> Here is a log with dyndbg turned up.
> https://gist.github.com/superm1/2f7ca6123431365d11e9bc3cc9329c14
>
> You can see firmware node \_SB_.PCI0.BUSC.NHI0 is physical device
> 0000:e5:00.5.
Thanks!
> > One suspect is
> > that when the runtime suspends happens, the power resource or _PS3()
> > gets called to put the device into D3cold and it has some sort of
> > Sleep() inside. This allows the OS to relese the CPU to do other things
> > too according to ACPI spec (and Linux does this) so now when you plug in
> > the device the GPE triggers wake and Linux starts processing that one.
> > We see the above runtime resume but then the Sleep() in the _PS3()
> > completes and cuts the power from the NHI while we are in the middle of
> > resuming it already (or something along those lines).
> >
> > We saw similar "context switch" happening when PCIe L2 failed, BIOS
> > implemented the timeout using Sleep() that allowed Linux to process
> > other things which resulted unexpected wake (not the same as here
> > though).
> >
> > So one thing to check too is how the ACPI methods get called and
> > especially if they somehow end up messing with each other.
>
> At least to me it looks pretty straightforward that each D0 sequence is
> going through and nothing looked out of order:
>
> acpi_pci_set_power_state()
> ->acpi_device_set_power()
> ->->acpi_dev_pm_explicit_set()
>
>
> There is a Sleep() call in M232() which is a SMI used in both _PS0 and _PS3,
> but as far as I can tell it's not actually called as the case always has
> zero as an argument.
>
> Here's some snippets of ASL from the NHI0 device to see.
> https://gist.github.com/superm1/32ef1f822a6220a41b19574024717f79
>
> I feel your theory is right about it being an aborted D3cold request, but I
> don't believe it's from concurrency issue of _PS3. I feel there isn't any
> guard rails either in Linux or the AML for ensuring that
> the transition actually finished.
Yeah, I agree now. It does not look like the methods are messing each
other here. We don't see the GPE handler being run but I don't think it
matters here. For some reason the device just is not yet ready when it
is supposed to be in D0.
Sorry for wasting your time with these suspects.