Re: [BUG] ASUS ProArt PX13 HN7306WU: amd_pmc s2idle S0ix corrupts AMD 1022:150b root port, NVIDIA dGPU returns header type 7f
From: Mario Limonciello
Date: Thu Apr 02 2026 - 17:15:06 EST
On 4/2/26 3:59 PM, Joyful Lee wrote:
> On Thu, Apr 2, 2026 at 3:58 PM Mario Limonciello
> <mario.limonciello@xxxxxxx> wrote:
>> What does corruption mean in your context? Because I don't have any
>> idea what a root port being corrupted means.
>> Basically I don't trust a single assumption that was made by that LLM.
>> Can you please get an amd-s2idle report from an unpatched mainline kernel:
>> https://origin.kernel.org/doc/html/next/arch/x86/amd-debugging.html#s2idle-debugging-tool
>> *Then* let's talk like people used to before LLMs came into the picture
>> about your problem.
> Hi,
> You want to talk like people? Let's start with this: My name is Joy and
> I worked very hard on that bug report. Everything in it is accurate, and
> for you to reduce it to "garbage" is not very professional.
My apologies if I came across as rash. My time is valuable to me, and I've had multiple people reach out to me from CLAW instances, so I've started to recognize the formula of LLM output. It's incredibly tempting to trust, but it's too often wrong when it comes to debugging issues that cross kernel and hardware.
So yes, with how much garbage is flying around, I want to confirm I'm talking to a human when something looks like LLM-formed output.
> Second, I
> appreciate your and anyone else's help on this problem, but it is not
> *my* problem. It is a problem put upon me when I purchased a product
> with your company's logo on it.
> As I said in my original post, I'm happy to test anything you want with
> very quick turnaround, and you asked for an amd-s2idle report from an
> unpatched mainline kernel. You got it!
> Thanks again for your help.
> Joy
Can you please explain what root port corruption means? Do you actually mean graphical corruption?
A few things I want to point out after resume that might be relevant.
> pci 0000:c4:00.0: Unable to change power state from D0 to D0, device inaccessible
> snd_hda_intel 0000:c4:00.1: Unable to change power state from D3hot to D0, device inaccessible
Both of these devices are part of the discrete GPU (VGA and Audio class endpoints). Neither came back in the right state, which has me wondering whether the EC is cutting a power rail used for the discrete GPU and that rail is coming back at the wrong time.
It wouldn't be the first time that the kernel raced with firmware and lost.
What happens if you issue a PCIe reset on the root port above them after resume? Does it recover?
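For anyone following along, here is a minimal sketch of one way to try that reset from userspace. It is not a tested recipe for this machine. It assumes the standard sysfs layout, where /sys/bus/pci/devices/<BDF> is a symlink into the device tree, and it assumes the kernel exposes a reset_subordinate attribute on bridges, which only recent kernels do. The helper names upstream_bridge and reset_below are made up for illustration.

```python
import os


def upstream_bridge(bdf, sysfs_root="/sys"):
    """Return the BDF of the bridge directly above `bdf`, or None if the
    device sits on the root bus. Raises FileNotFoundError if `bdf` is absent."""
    link = os.path.join(sysfs_root, "bus", "pci", "devices", bdf)
    if not os.path.exists(link):
        raise FileNotFoundError(link)
    # The real path looks like .../devices/pci0000:00/<bridge>/<bdf>,
    # so the parent directory name is the upstream bridge.
    parent = os.path.basename(os.path.dirname(os.path.realpath(link)))
    # A parent named "pciDDDD:BB" is the host bridge, not a PCI device.
    if parent.startswith("pci"):
        return None
    return parent


def reset_below(bridge, sysfs_root="/sys", dry_run=True):
    """Ask the kernel to reset everything behind `bridge`.

    ASSUMPTION: the bridge exposes `reset_subordinate` (recent kernels only).
    Returns the attribute path that was (or would be) written, or None if
    the attribute does not exist on this kernel.
    """
    path = os.path.join(sysfs_root, "bus", "pci", "devices", bridge,
                        "reset_subordinate")
    if not os.path.exists(path):
        return None
    if not dry_run:
        with open(path, "w") as f:
            f.write("1")  # needs root on a real system
    return path
```

Run as root after resume, calling upstream_bridge("0000:c4:00.0") should name the port above the dGPU, and reset_below(port, dry_run=False) attempts the reset. If reset_below returns None, the attribute is missing on that kernel; removing the device through its sysfs remove file and then writing 1 to /sys/bus/pci/rescan is another thing to try.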