Re: [PATCH] thunderbolt: Resume PCIe bridges after switch is found on AMD USB4 controller
From: Kai-Heng Feng
Date: Mon Sep 12 2022 - 03:35:48 EST
Hi,
On Thu, Sep 8, 2022 at 11:22 PM Limonciello, Mario
<mario.limonciello@xxxxxxx> wrote:
[snipped]
> > Nice to hear. Hopefully this can be fixed at firmware/hardware side.
>
> I guess you and Anson might want to sync up offline and compare whether
> you have the same hardware stepping.
Sure.
[snipped]
> > We used UEFI capsule to update the firmware, via fwupd.
>
> So that's a difference from how Anson did it. Could you perhaps dump
> the BIOS SPI image? Anson can flash the exact same dump via dediprog
> and see if he can repro.
>
> It would let us confirm if it was caused by your upgrade path.
OK, will dissuss this with AMD/HP offline.
>
> >
> >>
> >> 2) Did you change any BIOS settings? Particularly anything to do with Pre-OS CM?
> >
> > No, nothing in BIOS was changed. >
> >>
> >> 3) If you explicitly reset to HP's "default BIOS settings" does it resolve?
> >
> > Doesn't help. I put the device to ACPI G3 and it doesn't help, either.
>
> OK.
>
> >
> >>
> >> 4) Can you double check ADP_CS_5 bit 31? I attached is a patch to kernel Bugzilla to
> >> add dyndbg output for it. If it was for some reason set by Pre-OS CM in your BIOS/settings
> >> combination, we might need to undo it by the Linux CM.
> >
> > All ports say "Hotplug disabled: 0".
> >
> > dmesg attached to the bugzilla.
>
> OK, that at least rules out DHP from Pre-OS CM.
>
> >
> >>
> >> 5) Are you changing any of the default runtime PM policies for any of the USB4 routers or
> >> root ports used for tunneling using software like TLP?
> >
> > No. And they should be suspended by default.
> >
>
> Thinking about this being possibly a firmware upgrade path problem, can
> you please check:
>
> # grep SMC /sys/kernel/debug/dri/0/amdgpu_firmware_info
>
> Anson's system was 0x04453200 (program 4, version 69.50.0).
Exactly the same here.
Kai-Heng
>
> > Kai-Heng
> >
> >>
> >>>
> >>>>
> >>>> Bugzilla:
> >>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugz
> >>> illa.kernel.org%2Fshow_bug.cgi%3Fid%3D216448&data=05%7C01%7Cm
> >>> ario.limonciello%40amd.com%7C1e27b1d6f69e42796c7b08da8f107121%7C3d
> >>> d8961fe4884e608e11a82d994e183d%7C0%7C0%7C637979598042186185%7CU
> >>> nknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI
> >>> 6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=0lhcaKfUyoK
> >>> 0FXT9uDZ8a%2Fpxs9tHd8aoQcyPFdB%2F0eY%3D&reserved=0
> >>>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@xxxxxxxxxxxxx>
> >>>> ---
> >>>> drivers/thunderbolt/nhi.c | 29 +++++++++++++++++++++++++++++
> >>>> drivers/thunderbolt/switch.c | 6 ++++++
> >>>> drivers/thunderbolt/tb.c | 1 +
> >>>> drivers/thunderbolt/tb.h | 5 +++++
> >>>> include/linux/thunderbolt.h | 1 +
> >>>> 5 files changed, 42 insertions(+)
> >>>>
> >>>> diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
> >>>> index cb8c9c4ae93a2..75f5ce5e22978 100644
> >>>> --- a/drivers/thunderbolt/nhi.c
> >>>> +++ b/drivers/thunderbolt/nhi.c
> >>>> @@ -1225,6 +1225,8 @@ static int nhi_probe(struct pci_dev *pdev, const
> >>> struct pci_device_id *id)
> >>>> {
> >>>> struct tb_nhi *nhi;
> >>>> struct tb *tb;
> >>>> + struct pci_dev *p = NULL;
> >>>> + struct tb_pci_bridge *pci_bridge, *n;
> >>>> int res;
> >>>>
> >>>> if (!nhi_imr_valid(pdev)) {
> >>>> @@ -1306,6 +1308,19 @@ static int nhi_probe(struct pci_dev *pdev, const
> >>> struct pci_device_id *id)
> >>>> nhi_shutdown(nhi);
> >>>> return res;
> >>>> }
> >>>> +
> >>>> + if (pdev->vendor == PCI_VENDOR_ID_AMD) {
> >>>> + while ((p = pci_get_device(PCI_VENDOR_ID_AMD, 0x14cd,
> >>> p))) {
> >>>> + pci_bridge = kmalloc(sizeof(struct tb_pci_bridge),
> >>> GFP_KERNEL);
> >>>> + if (!pci_bridge)
> >>>> + goto cleanup;
> >>>> +
> >>>> + pci_bridge->bridge = p;
> >>>> + INIT_LIST_HEAD(&pci_bridge->list);
> >>>> + list_add(&pci_bridge->list, &tb->bridge_list);
> >>>> + }
> >>>> + }
> >>>
> >>> You can't walk the device tree and create a "shadow" list of devices
> >>> like this and expect any lifetime rules to work properly with them at
> >>> all.
> >>>
> >>> Please do not do this.
> >>>
> >>> greg k-h
>