Re: [REGRESSION] Too-low frequency limit for AMD GPU PCI-passed-through to Windows VM

From: Lazar, Lijo
Date: Tue Jan 25 2022 - 08:37:01 EST




On 1/25/2022 5:28 AM, James Turner wrote:
> Hi Lijo,
>
>> Not able to relate to how it affects gfx/mem DPM alone. Unless Alex
>> has other ideas, would you be able to enable drm debug messages and
>> share the log?
>
> Sure, I'm happy to provide drm debug messages. Enabling everything
> (0x1ff) generates *a lot* of log messages, though. Is there a smaller
> subset that would be useful? Fwiw, I don't see much in the full drm logs
> about the AMD GPU anyway; it's mostly about the Intel GPU.
>
> All the messages in the system log containing "01:00" or "1002:6981" are
> identical between the two versions.
>
> I've posted below the only places in the logs which contain "amd". The
> commit with the issue (f9b7f3703ff9) has a few drm log messages from
> amdgpu which are not present in the logs for f1688bd69ec4.
>
>
> # f1688bd69ec4 ("drm/amd/amdgpu:save psp ring wptr to avoid attack")
>
> [drm] amdgpu kernel modesetting enabled.
> vga_switcheroo: detected switching method \_SB_.PCI0.GFX0.ATPX handle
> ATPX version 1, functions 0x00000033
> amdgpu: CRAT table not found
> amdgpu: Virtual CRAT table created for CPU
> amdgpu: Topology: Add CPU node
>
>
> # f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)")
>
> [drm] amdgpu kernel modesetting enabled.
> vga_switcheroo: detected switching method \_SB_.PCI0.GFX0.ATPX handle
> ATPX version 1, functions 0x00000033
> [drm:amdgpu_atif_pci_probe_handle.isra.0 [amdgpu]] Found ATIF handle \_SB_.PCI0.GFX0.ATIF
> [drm:amdgpu_atif_pci_probe_handle.isra.0 [amdgpu]] ATIF version 1
> [drm:amdgpu_acpi_detect [amdgpu]] SYSTEM_PARAMS: mask = 0x6, flags = 0x7
> [drm:amdgpu_acpi_detect [amdgpu]] Notification enabled, command code = 0xd9
> amdgpu: CRAT table not found
> amdgpu: Virtual CRAT table created for CPU
> amdgpu: Topology: Add CPU node



Hi James,

Specifically, I was looking for any events at these two places that occur because of the patch:

https://elixir.bootlin.com/linux/v5.16/source/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c#L411

https://elixir.bootlin.com/linux/v5.16/source/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c#L653

The patch specifically affects these two functions. If either of them is invoked on your system when or before the VM is started, as a result of the patch, we can work from there and check what the side effect is.
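
One way to check for such invocations without reading the whole drm log is to filter it for amdgpu debug lines emitted from ACPI-related functions. The following is only a minimal sketch: it assumes the `[drm:FUNCTION [amdgpu]] message` line format seen in the log excerpts in this thread, and the prefix list `ACPI_PREFIXES` and the helper name `acpi_debug_events` are hypothetical choices for illustration, not something taken from the patch or the driver.

```python
import re

# Matches DRM debug lines of the form seen in this thread:
#   [drm:FUNCTION [amdgpu]] message
# re.search is used so that a leading timestamp or journal prefix is tolerated.
DRM_DEBUG_LINE = re.compile(r"\[drm:(?P<func>[\w.]+) \[amdgpu\]\]\s*(?P<msg>.*)")

# Assumption: the functions of interest share one of these name prefixes.
ACPI_PREFIXES = ("amdgpu_atif_", "amdgpu_atcs_", "amdgpu_acpi_")


def acpi_debug_events(lines):
    """Return (function, message) pairs for amdgpu ACPI-related debug lines."""
    events = []
    for line in lines:
        match = DRM_DEBUG_LINE.search(line.strip())
        if match and match.group("func").startswith(ACPI_PREFIXES):
            events.append((match.group("func"), match.group("msg")))
    return events
```

It could be run over a saved log, for example `acpi_debug_events(open("dmesg.txt"))`, once before starting the VM and once after, to see whether any new events appear. If these messages are driver-category debug output, enabling only that category might be enough to capture them, but that is an assumption worth verifying against the full 0x1ff log first.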

Thanks,
Lijo

> Other things I'm willing to try if they'd be useful:
>
> - I could update to the 21.Q4 Radeon Pro driver in the Windows VM. (The
> 21.Q3 driver is currently installed.)
>
> - I could set up a Linux guest VM with PCI passthrough to compare to the
> Windows VM and obtain more debugging information.
>
> - I could build a kernel with a patch applied, e.g. to disable some of
> the changes in f9b7f3703ff9.
>
> James