Re: [Regression] 6.11.0-rc1: AMD CPU boot with error when CPPC feature disabled by BIOS

From: Gautham R. Shenoy
Date: Mon Sep 30 2024 - 10:47:39 EST


Hello,


On Thu, Sep 26, 2024 at 01:56:21PM -0700, Luna Nova wrote:
> Hi Gautham,
>
> I'm seeing the same message on a server board with an EPYC Rome 7K62 CPU.
> CPPC is set to enabled in the UEFI firmware settings.
>
> Kernel: 6.11.0 (6.11.0 #1-NixOS SMP PREEMPT_DYNAMIC Sun Sep 15 14:57:56 UTC 2024 x86_64 GNU/Linux)
> Board: Gigabyte MZ22-G20-00 Rev 1.0 (in a G292-Z20 Rev 100)
> UEFI Firmware: R23_F01 (2021-09-06, latest available version at time of this message)
> AGESA PI Version 1.0.0.C.

This is old! Can you check with your motherboard manufacturer whether
they have a newer version available?

>
> CONFIG_ACPI_CPPC_LIB=y
> CONFIG_X86_AMD_PSTATE=y
> CONFIG_X86_AMD_PSTATE_DEFAULT_MODE=3
> CONFIG_X86_AMD_PSTATE_UT=m
>
> $ cat /proc/cmdline
> initrd=\EFI\nixos\z16gakzlwypxbjzm5y93x10cjmxjvial-initrd-linux-6.11-initrd.efi init=/nix/store/cqhw9x7w7dc3avwri4i2lk0mgc31arll-nixos-system-tsukiakari-nixos-24.11/init sysrq_always_enabled fsck.mode=force loglevel=4 audit=0 amd_pstate=guided amd_pstate.shared_mem=1 amdgpu.lockup_timeout=10000,10000,10000,10000
> $ sudo dmesg | grep pstate
> amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect

This happens because the ACPI CPPC version on your platform is v2,
which doesn't advertise the nominal_freq and lowest_freq.
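
One way to see what the firmware advertises is to read the ACPI CPPC
files in sysfs. The following is only a minimal sketch, assuming the
usual /sys/devices/system/cpu/cpuN/acpi_cppc/ layout; the helper names
read_cppc() and missing_freqs() are made up for illustration.

```python
from pathlib import Path

# Assumption: CPPC capabilities for cpu0 are exposed in this sysfs directory.
CPPC_DIR = Path("/sys/devices/system/cpu/cpu0/acpi_cppc")
FIELDS = ("highest_perf", "nominal_perf", "lowest_perf",
          "nominal_freq", "lowest_freq")


def read_cppc(cppc_dir=CPPC_DIR):
    """Return {field: int or None} for the CPPC capability files."""
    values = {}
    for name in FIELDS:
        try:
            values[name] = int((Path(cppc_dir) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None  # file missing, unreadable, or not a number
    return values


def missing_freqs(values):
    """Names of the frequency fields that are absent or reported as 0."""
    return [n for n in ("nominal_freq", "lowest_freq") if not values.get(n)]


if __name__ == "__main__":
    vals = read_cppc()
    for name, val in vals.items():
        print(f"{name}: {val}")
    bad = missing_freqs(vals)
    if bad:
        print("not advertised (0 or missing):", ", ".join(bad))
```

If nominal_freq and lowest_freq show up as 0 here, that would be
consistent with the error message quoted above.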


> (Repeats for each core)
> amd_pstate: failed to register with return -19

Working as expected! (-19 is -ENODEV.)
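
To illustrate why zero frequencies make the driver give up, here is a
simplified model of the idea. It is not the driver's actual code: it
assumes max_freq is scaled from nominal_freq by highest_perf/nominal_perf
and that min_freq comes from lowest_freq, and the function names and the
example numbers are made up.

```python
def derive_freqs(nominal_freq_mhz, lowest_freq_mhz, highest_perf, nominal_perf):
    """Return (min_khz, max_khz, nominal_khz) scaled from the nominal point.

    Assumption: max frequency = nominal frequency * highest_perf / nominal_perf.
    """
    if nominal_perf <= 0:
        return 0, 0, 0
    max_mhz = nominal_freq_mhz * highest_perf // nominal_perf
    return lowest_freq_mhz * 1000, max_mhz * 1000, nominal_freq_mhz * 1000


def freqs_valid(min_khz, max_khz, nominal_khz):
    """Sanity check: every frequency must be non-zero and min must not exceed max."""
    return min_khz > 0 and max_khz > 0 and nominal_khz > 0 and min_khz <= max_khz


if __name__ == "__main__":
    # Firmware that advertises no frequencies: everything collapses to 0.
    print(derive_freqs(0, 0, 255, 176), freqs_valid(*derive_freqs(0, 0, 255, 176)))
    # Made-up values for a firmware that does advertise them.
    print(derive_freqs(2000, 500, 200, 100), freqs_valid(*derive_freqs(2000, 500, 200, 100)))
```

With nominal_freq and lowest_freq both 0, all three derived values are 0,
which matches the "min_freq(0) or max_freq(0) or nominal_freq(0)" message.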


> stage-1-init: [Thu Sep 26 20:04:53 UTC 2024] loading module amd_pstate_ut...
> amd_pstate_ut: 1 amd_pstate_ut_acpi_cpc_valid success!
> amd_pstate_ut: 2 amd_pstate_ut_check_enabled success!
> amd_pstate_ut: 3 amd_pstate_ut_check_perf success!
> amd_pstate_ut: 4 amd_pstate_ut_check_freq success!
>
> It seems odd that amd_pstate fails to load but amd_pstate_ut reports success for all checks.

Hmm.. This is strange. I need to check why this is happening.

>
> > it appears that the CPPC version on your platform is v2 which does not
> > advertise the nominal_freq and the lowest_freq. In the absence of these,
> > it is not possible for the amd-pstate driver to infer the
> > min/max_freq. Which is why the driver bails at this later stage.
>
> > The way around it is to add a quirk for your BIOS as done in this commit
> > from Perry:
> > eb8b6c368202 ("cpufreq: amd-pstate: Add quirk for the pstate CPPC capabilities missing")
>
> Perry's patch you referenced as an example above targets the same 7K62 CPU but requires one specific BIOS version.

Yes, Perry's solution targets a specific BIOS version because a more
recent version of the BIOS may advertise CPPC v3, which is what the
amd-pstate driver expects.

> Should I submit a patch adding the version on this system to that quirk?

Yes, provided your board manufacturer does not have a newer version of
the firmware that advertises CPPC v3.
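
For such a patch you would need the DMI strings that identify your BIOS.
A rough sketch for collecting them from /sys/class/dmi/id follows. The
helper names are hypothetical, and the substring comparison in
quirk_matches() is only my approximation of how a DMI match behaves, so
treat it as an assumption rather than the kernel's logic.

```python
from pathlib import Path

# Standard DMI sysfs directory on Linux; some files may be absent on older kernels.
DMI_DIR = Path("/sys/class/dmi/id")
DMI_FIELDS = ("bios_vendor", "bios_version", "bios_release",
              "bios_date", "board_name")


def read_dmi(dmi_dir=DMI_DIR):
    """Return {field: str or None} for the DMI identification files."""
    info = {}
    for name in DMI_FIELDS:
        try:
            info[name] = (Path(dmi_dir) / name).read_text().strip()
        except OSError:
            info[name] = None  # file not present or not readable
    return info


def quirk_matches(info, wanted):
    """True if every wanted substring occurs in the corresponding DMI field."""
    return all(sub in (info.get(field) or "") for field, sub in wanted.items())


if __name__ == "__main__":
    for field, value in read_dmi().items():
        print(f"{field}: {value}")
```

The printed bios_version and bios_release (or bios_date) are the values a
quirk entry for your board would have to match.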

>
> I'm confused by the quirk code: it's called "AMD EPYC 7K62" but it matches by BIOS revision and doesn't check the CPU model.
>
> An earlier version of the quirk included `boot_cpu_data.x86 == 0x17 && boot_cpu_data.x86_model == 0x31` to check the model; it now uses the nominal frequencies for a 7K62 regardless of the CPU model if the BIOS revision matches.

When you boot your system with acpi_cpufreq, what is the P0 P-state
frequency? Is it the same as the one used in the quirk?
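
If it helps, something along these lines could report it. This is a
sketch under the assumption that acpi-cpufreq is the active driver and
exposes scaling_available_frequencies (in kHz) for cpu0, with the highest
listed entry taken as P0; p0_freq_mhz() and active_driver() are
illustrative names.

```python
from pathlib import Path

# Assumption: per-policy cpufreq files for cpu0 live in this directory.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def active_driver(cpufreq_dir=CPUFREQ_DIR):
    """Name of the cpufreq driver in use, or None if it cannot be read."""
    try:
        return (Path(cpufreq_dir) / "scaling_driver").read_text().strip()
    except OSError:
        return None


def p0_freq_mhz(cpufreq_dir=CPUFREQ_DIR):
    """Highest entry of scaling_available_frequencies in MHz, or None."""
    try:
        text = (Path(cpufreq_dir) / "scaling_available_frequencies").read_text()
        freqs_khz = [int(tok) for tok in text.split()]
    except (OSError, ValueError):
        return None
    return max(freqs_khz) // 1000 if freqs_khz else None


if __name__ == "__main__":
    print("driver:", active_driver())
    print("P0 frequency (MHz):", p0_freq_mhz())
```

The reported value can then be compared by hand against the nominal_freq
in the quirk entry.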

>
> Best,
> Luna

--
Thanks and Regards
gautham.