Re: [BUG] ASUS ProArt PX13 HN7306WU: amd_pmc s2idle S0ix corrupts AMD 1022:150b root port, NVIDIA dGPU returns header type 7f

From: Mario Limonciello

Date: Fri Apr 03 2026 - 09:23:56 EST


On 4/2/26 4:32 PM, Joyful Lee wrote:
On Thu, Apr 2, 2026 at 5:14 PM Mario Limonciello
<mario.limonciello@xxxxxxx> wrote:
Can you please explain what root port corruption means? Are you
actually meaning graphical corruption?

A few things I want to point out after resume that might be relevant.

> pci 0000:c4:00.0: Unable to change power state from D0 to D0, device inaccessible
> snd_hda_intel 0000:c4:00.1: Unable to change power state from D3hot to D0, device inaccessible

Both of these devices are part of the discrete GPU (VGA and Audio class
endpoints). Neither came back in the right state, which has me
wondering whether the EC is cutting a power rail used by the discrete GPU
and the rail is coming back at the wrong time.

It wouldn't be the first time that the kernel raced with firmware and lost.

What happens if you pass a PCIe reset to the root port above them after
resume? Does it recover?
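
For reference, one way to try such a reset from userspace is the PCI sysfs interface. The sketch below is an assumption about what "pass a PCIe reset" could look like in practice, not a confirmed procedure for this platform. It writes `1` to a device's `reset` attribute, which the kernel only exposes when it has a reset method for that device. The address `0000:00:03.1` is the dGPU bridge named in this thread; the helper names and the `sysfs_root` parameter are hypothetical and exist only so the path logic can be exercised without root.

```python
from pathlib import Path

# Default location of PCI devices in sysfs on Linux.
SYSFS_PCI = Path("/sys/bus/pci/devices")


def reset_path(bdf: str, sysfs_root: Path = SYSFS_PCI) -> Path:
    """Return the path of the sysfs `reset` attribute for a PCI device."""
    return sysfs_root / bdf / "reset"


def reset_device(bdf: str, sysfs_root: Path = SYSFS_PCI) -> bool:
    """Ask the kernel to reset one PCI function.

    Returns False when the `reset` attribute is missing, which means the
    kernel exposes no reset method for this device.
    """
    path = reset_path(bdf, sysfs_root)
    if not path.exists():
        return False
    path.write_text("1")  # requires root on a real system
    return True
```

On the affected machine this would be called as `reset_device("0000:00:03.1")` after resume, followed by a check of dmesg and lspci to see whether c4:00.0 and c4:00.1 respond again.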

Corruption is just the best word I have to describe the way the dGPU
bridge (00:03.1) comes back in an unexpected state after resume and the
two devices under it (the NVIDIA GPU c4:00.0 and its audio device
c4:00.1) become unresponsive. The two lspci attachments on my original
post really demonstrate that weird state.
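
As background on the "header type 7f" in the subject line: when a PCI device stops responding, config space reads typically come back as all ones. The Header Type register is the byte at offset 0x0E; bit 7 is the multi-function flag and bits 6:0 give the header layout, so a raw 0xFF decodes to layout 0x7f. The sketch below shows that decoding. The function names are hypothetical, and it is an assumption (the attachments are not reproduced here) that the lspci dumps show exactly this all-ones pattern.

```python
HEADER_TYPE_OFFSET = 0x0E  # Header Type register in PCI config space


def vendor_id(cfg: bytes) -> int:
    """Vendor ID: 16-bit little-endian value at offset 0x00."""
    return int.from_bytes(cfg[0:2], "little")


def header_layout(cfg: bytes) -> int:
    """Header layout: bits 6:0 of the Header Type register."""
    return cfg[HEADER_TYPE_OFFSET] & 0x7F


def looks_inaccessible(cfg: bytes) -> bool:
    """True when the standard 64-byte header reads back as all ones."""
    header = cfg[:64]
    return len(header) == 64 and all(b == 0xFF for b in header)


def read_config(bdf: str) -> bytes:
    """Read the first 64 bytes of a device's config space from sysfs."""
    with open(f"/sys/bus/pci/devices/{bdf}/config", "rb") as f:
        return f.read(64)
```

For example, `looks_inaccessible(read_config("0000:c4:00.0"))` could be run before suspend and again after resume to compare the two states.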

Got it.


Indeed, the "Unable to change power state from D0/D3hot/D3cold to D0,
device inaccessible" messages are the main symptom of the problem. I'm
purposefully avoiding the nouveau and nvidia drivers at this point to
reduce confusion, but the problem happens even when they are loaded, and
both of them crash upon resume.

Thanks - I agree with removing drivers from the equation to look at the behavior of bridges and root ports.

It's unfortunate that turning off pcie_port_pm didn't help either.


When I issue a reset to the root port after resume, it does not recover.
The kernel prints the "unable to change power state" messages again and
the devices remain unresponsive.

Joy

To your questions in the original post (again, sorry for the original knee-jerk reaction):

1) Yes, the semantics of how AMD-PMC works are correct for this platform. There are lots of other Strix designs that do everything correctly.
2) Yes; there are FW dependencies that are design-specific.
3) Right now there are no quirks for working around the bug you've found.

I believe this issue you're seeing is likely a platform firmware bug. Whether it's AGESA or it's ASUS firmware is unclear right now.

I have not seen this specific symptom on other systems (including AMD reference systems), so I am /leaning/ more towards ASUS firmware.

I notice that the BIOS you have from ASUS is quite old:

> bios_date| 03/18/2025

Any chance there is a BIOS update available?
Unfortunately ASUS doesn't always make flashing in Linux easy, but hopefully you can find a way to flash the update if one is available.
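
To compare the installed firmware against whatever ASUS currently offers, the BIOS date can be read from sysfs. This is a minimal sketch, assuming the DMI date uses the MM/DD/YYYY form seen in the `bios_date` line above; the function names are hypothetical.

```python
from datetime import date, datetime


def parse_bios_date(text: str) -> date:
    """Parse a DMI BIOS date, assumed to be in MM/DD/YYYY form."""
    return datetime.strptime(text.strip(), "%m/%d/%Y").date()


def bios_age_days(bios_date: str, today: date) -> int:
    """Number of days between the BIOS date and `today`."""
    return (today - parse_bios_date(bios_date)).days


def read_bios_date(path: str = "/sys/class/dmi/id/bios_date") -> str:
    """Read the BIOS date string exposed by the kernel's DMI sysfs files."""
    with open(path) as f:
        return f.read().strip()
```

Calling `bios_age_days(read_bios_date(), date.today())` gives the firmware age in days, which can then be checked against the release dates on the vendor's download page.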