Re: [RFC 2/2] platform/x86/amd: pmf: Add manual control support

From: Mario Limonciello
Date: Thu Dec 19 2024 - 11:15:26 EST


On 12/19/2024 09:24, Antheas Kapenekakis wrote:
> On Thu, 19 Dec 2024 at 15:50, Mario Limonciello <superm1@xxxxxxxxxx> wrote:
>>
>> On 12/19/2024 07:12, Antheas Kapenekakis wrote:
>>> Hi Mario,
>>> given that there is a Legion Go driver in the works, and Asus already
>>> has a driver, the only thing that would be left for locking down ACPI
>>> access is manufacturers w/o vendor APIs.
>>>
>>> So, can we restart the conversation about this driver? It would be
>>> nice to get to a place where we can lock down /dev/mem and ACPI by
>>> spring.
>>
>> As Shyam mentioned we don't have control for limits by the PMF driver
>> for this on PMF v2 (Strix) or later platforms.
>>
>> So if we were to revive this custom discussion it would only be for
>> Phoenix and Hawk Point platforms.
>
> That's unfortunate.
>
>>> Moreover, since the other two proposed drivers use the
>>> firmware_attributes API, should this be used here as well?
>>
>> I do feel that if we revive this conversation specifically for Phoenix
>> and Hawk Point platforms, yes, we should use the same API to expose it
>> to userspace as those other two drivers do.
>>
>> I'd like Shyam's temperature on this idea though before anyone spends
>> time on it. If he's amenable would you want to work on it?
>
> We currently expect the 2025 lineup to include a lot of Strix Point
> handhelds, so I'd like a solution that works with that. OneXPlayer
> released a model already, and GPD is getting ready to ship as well.
>
> Yeah, I could put some hours into it after I go through some overdue
> stuff.
>
>>> By the way, you were right about needing a taint for this. Strix Point
>>> fails to enter a lower power state during sleep if you set the limit
>>> lower than 10W. This is not ideal, as Hawk Point could go down to 5W
>>> while still showing a power difference, but I am unsure where this bug
>>> should be reported. This happens through both ryzenadj and ALIB.
>>
>> Who is to say this is a bug? Abusing a debugging interface with a
>> reverse engineered tool means you might be able to configure a platform
>> out of specifications.
>
> The spec being 10+W would be very undesirable for handhelds with Strix
> Point, so I'd hope somebody looks into it, esp. if it can be fixed
> with a BIOS fw update before more handhelds come out. I can raise the
> minimum TDP to 10W, with some user complaints.
>
> Asus and Lenovo use the same mailbox so they'd share the issue too.
>
> FYI for a typical handheld with e.g., a 60Wh battery, a 10W envelope
> results in around 20-22W total consumption which is around 2.5 hours.
> Hawk Point can be TDP limited down to 16W total consumption (TDP ~7W)
> and can go down to 8W with frame limiting etc. I do not have numbers
> for Strix Point yet, but to match Hawk Point it has to allow TDP to go
> down to 7W. I think for 2025, customer expectation will be 6-8 hours+
> at low wattages.


I've got a fundamental question - why the fixation on PPT?

This just sets "limits" for the package. In Windows it's probably the best knob to tune to adjust performance in an effort to extend battery life, but in Linux we have a lot of other knobs:

* the ability to tune EPP (energy_performance_preference)
* set min and max CPU frequencies (scaling_min_freq, scaling_max_freq)
* offline cores at will
* change the DPM setting in the GPU driver (power_dpm_force_performance_level)
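
As a rough sketch of where these knobs usually live, the snippet below maps each one to its sysfs file. It assumes a cpufreq driver that exposes EPP (for example amd-pstate in active mode) and an amdgpu device; the helper name knob_paths and the card0 default are made up for illustration, and nothing here touches the system.

```python
from pathlib import Path


def knob_paths(cpu: int, card: int = 0, root: str = "/sys") -> dict:
    """Return the sysfs files behind the knobs listed above.

    Hypothetical helper: `cpu` and `card` are indices chosen by the caller,
    and `root` is only a parameter so the mapping can be checked off-target.
    """
    cpu_dir = Path(root) / "devices/system/cpu" / f"cpu{cpu}"
    gpu_dir = Path(root) / "class/drm" / f"card{card}" / "device"
    return {
        # Per-core energy/performance hint.
        "epp": cpu_dir / "cpufreq/energy_performance_preference",
        # Per-core frequency bounds, in kHz.
        "min_freq": cpu_dir / "cpufreq/scaling_min_freq",
        "max_freq": cpu_dir / "cpufreq/scaling_max_freq",
        # Write 0 to offline a core, 1 to bring it back.
        "online": cpu_dir / "online",
        # amdgpu DPM policy for the whole GPU.
        "gpu_dpm": gpu_dir / "power_dpm_force_performance_level",
    }


if __name__ == "__main__":
    for name, path in knob_paths(cpu=0).items():
        print(f"{name:9s} {path}")
```

Whether each file exists depends on the driver and kernel configuration, so check before writing to any of them.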

All the core-related knobs can be changed on a per-core basis. So for example even on a non-heterogeneous design you could potentially make it perform "like" a hetero design, where you set it up so that some cores don't go above nominal frequency or the EPP value is tuned less aggressively on some cores.

These knobs can have just as drastic an effect on battery life as adjusting the various power limiting knobs. Most importantly, these knobs have architectural limits that you won't be able to override, so you can safely change them to min/max and see what happens.

Specifically, I feel that if you keep EPP at balance_performance, keep scaling_min_freq at the lowest nonlinear frequency, and change scaling_max_freq on a few of the cores, you should be able to influence the battery life quite a bit while still keeping the system responsive.
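
A minimal sketch of that recipe follows, assuming the standard cpufreq sysfs layout. The function names, the choice of which cores to cap, and the frequency values are all hypothetical; on amd-pstate the lowest nonlinear frequency should be readable per core from the amd_pstate_lowest_nonlinear_freq attribute, but treat that as an assumption to verify on the target system.

```python
from pathlib import Path


def build_plan(cpus, capped, cap_khz, lowest_nonlinear_khz, root="/sys"):
    """Return (path, value) pairs for the recipe above; nothing is written.

    cpus:                 all core indices to configure
    capped:               subset of cores whose max frequency is lowered
    cap_khz:              hypothetical ceiling for the capped cores, in kHz
    lowest_nonlinear_khz: floor for every core, in kHz
    """
    plan = []
    for cpu in cpus:
        d = Path(root) / "devices/system/cpu" / f"cpu{cpu}" / "cpufreq"
        # Keep EPP at balance_performance on every core.
        plan.append((d / "energy_performance_preference", "balance_performance"))
        # Pin the floor at the lowest nonlinear frequency.
        plan.append((d / "scaling_min_freq", str(lowest_nonlinear_khz)))
        # Lower the ceiling only on the selected cores.
        if cpu in capped:
            plan.append((d / "scaling_max_freq", str(cap_khz)))
    return plan


def apply_plan(plan):
    """Write each value to its sysfs file (needs root on a real system)."""
    for path, value in plan:
        path.write_text(value)


if __name__ == "__main__":
    # Example numbers only: cap cores 4-7 of an 8-core part at 2.0 GHz.
    for path, value in build_plan(range(8), {4, 5, 6, 7}, 2_000_000, 1_100_000):
        print(f"{value:>20s} -> {path}")
```

Printing the plan first and applying it separately makes it easy to review what would change before committing to it.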