Re: [REGRESSION][BISECTED] "xHCI host controller not responding, assume dead" on stable kernel > 6.8.7

From: Limonciello, Mario
Date: Tue May 21 2024 - 06:51:37 EST




On 5/21/2024 3:55 AM, Mika Westerberg wrote:
Hi,

On Tue, May 21, 2024 at 10:07:23AM +0200, Gia wrote:
Thank you Mika,

Here you have the output of sudo journalctl -k without enabling the
kernel option "pcie_aspm=off": https://codeshare.io/7JPgpE. Without
"pcie_aspm=off", "thunderbolt.host_reset=false" is not needed, my
thunderbolt dock does work. I also connected a 4k monitor to the
thunderbolt dock thinking it could provide more data.

I'm almost sure I used this option when I set up this system because
it solved some issues with system suspending, but it happened many
months ago.

Okay. I recommend not using it. The defaults should always be the best
option (unless you really know what you are doing or are working around
some issue).

Windows and Linux handle port PM differently at suspend. I've made a few attempts at patch series to unify them; some "smaller" pieces have landed, as well as a quirk for one of the root ports.

But the specific issue that was happening was a platform bug triggered by this difference. It has since been fixed, and I guess you have a new BIOS, Gia.

I completely agree with Mika, though: the default policy for Linux is generally right.
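
For anyone who wants to check whether a system still carries overrides like the ones discussed here, the following is a minimal sketch, not an official tool. It assumes the kernel command line is available as a string (on Linux, the contents of /proc/cmdline); the function name and the list of watched parameters are made up for illustration.

```python
# Parameters mentioned in this thread that override kernel defaults.
# This list is an assumption for illustration; extend it as needed.
WATCHED = ("pcie_aspm", "thunderbolt.host_reset")


def find_overrides(cmdline: str, watched=WATCHED) -> dict:
    """Return {parameter: value} for each watched parameter found in cmdline."""
    found = {}
    for token in cmdline.split():
        # Split "key=value"; a bare flag without "=" yields an empty value.
        key, _, value = token.partition("=")
        if key in watched:
            found[key] = value
    return found
```

On a running system this could be called as find_overrides(open("/proc/cmdline").read()). An empty result means none of the watched overrides are set.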


The dmesg you shared looks good. There are a few oddities, but they should
not matter from a functional perspective (unless you are planning to have
a second monitor connected).

First is this:

May 21 09:59:40 um773arch kernel: thunderbolt 0000:36:00.5: IOMMU DMA protection is disabled

It should really be enabled, but I'm not familiar enough with AMD hardware
to say more, so I'm hoping Mario can comment on that.

This is controlled by OEM BIOS policy.
You should try to turn it on if you can as it's a more secure setup.
Some of the Linux stack (for example bolt) will automatically authorize PCIe and TBT3 devices when it's deemed secure.

I'm not familiar with the OEM for your machine, but here are some strings to look for in the BIOS setup that might point you toward enabling it:

1) "Kernel DMA protection"
2) "Security levels"

I know some OEMs also only enable it when you "load optimized defaults".
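
To confirm the result after changing the BIOS setting, without digging through dmesg, something like the sketch below may help. It assumes the thunderbolt driver exposes a per-domain iommu_dma_protection attribute under /sys/bus/thunderbolt/devices (as described in the kernel's Thunderbolt admin guide) and that the attribute reads "1" when protection is active. The function name is hypothetical.

```python
from pathlib import Path


def dma_protection_status(root: str = "/sys/bus/thunderbolt/devices") -> dict:
    """Map each Thunderbolt/USB4 domain name to True if IOMMU DMA protection is on."""
    status = {}
    # Assumed layout: <root>/domainN/iommu_dma_protection containing "0" or "1".
    for attr in sorted(Path(root).glob("domain*/iommu_dma_protection")):
        status[attr.parent.name] = attr.read_text().strip() == "1"
    return status
```

An empty dict means no domain with that attribute was found, for example because the thunderbolt driver is not loaded.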


The second thing is the USB4 link that seems to be degraded to 2x10G =
20G even though you say it is a Thunderbolt cable. I'll comment more on
that in the other email.