On Wed, Jan 08, 2025 at 10:15:34AM +0100, Neil Armstrong wrote:
On 08/01/2025 04:11, Bjorn Andersson wrote:
On Tue, Jan 07, 2025 at 09:13:18AM +0100, Neil Armstrong wrote:
Hi,
On 07/01/2025 00:39, Bjorn Andersson wrote:
On Fri, Jan 03, 2025 at 03:38:26PM +0100, Neil Armstrong wrote:
On the SM8650, dynamic clock and voltage scaling (DCVS) is done in a
hardware-controlled loop using the LMH and EPSS blocks, with constraints and
OPPs programmed in the board firmware.
Since the hardware does a better job of keeping the CPU temperatures
within an acceptable range by taking into account more parameters, such as
the die characteristics or other factory-fused values, it makes no sense to
try to reproduce a similar set of constraints with the Linux cpufreq thermal
core.
In addition, the tsens IP is responsible for monitoring the temperature
across the SoC, and the current settings will heavily trigger the tsens
UP/LOW interrupts if the CPU temperatures reach the hardware thermal
constraints, which are currently defined in the DT. And since the CPUs
are not hooked to the thermal trip points, the resulting interrupts and
calculations are a waste of system resources.
Instead, set higher temperatures in the CPU trip points, and hook a CPU
idle injector with a 100% duty cycle to the highest trip point in case
the hardware DCVS cannot handle the temperature surge, doing our best to
avoid reaching the critical temperature trip point, which would trigger an
inevitable thermal shutdown.
Are you able to hit these higher temperatures? Do you have a test
case where idle injection proves successful in keeping us from
reaching the critical temp?
No. I've been able to test idle injection and observed a noticeable effect,
but I had to set a lower trip point. Do you know how I can easily "block"
LMH/EPSS from scaling down and let the temperature go higher?
I don't know how to override that configuration.
E.g. on the X13s (SC8280XP) we opted to rely on LMH/EPSS and define only
the critical trip, for when the hardware fails us.
That's the goal here as well.
How about simplifying the patch by removing the idle-injection step and
just relying on LMH/EPSS and the "critical" trip (at least until someone
can prove that there's value in the extra mitigation)?
OK, but I see value in this idle-injection mitigation: in the case that
LMH/EPSS fails, the only lever HLOS controls is to stop scheduling tasks,
since the frequency won't be able to scale anymore.
I think that sounds good, but afaict we don't have any indication of
this being a problem and we don't have any way to test that it actually
solves that problem.
Anyway, I agree it can be added later on, so should I drop the two trip
points and leave only the critical one?
I think that's a simple and functional starting point - and it solves
your IRQ issue.
Regards,
Bjorn
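
For reference, the critical-only starting point agreed on above could look
roughly like the fragment below. This is only a sketch and not taken from
the patch: the zone name, the tsens phandle and sensor index, and the
temperature and hysteresis values are all placeholders chosen for
illustration.

```dts
/* Illustrative sketch: names, sensor index and values are assumptions. */
thermal-zones {
	cpu0-thermal {
		/* hypothetical tsens instance and sensor index */
		thermal-sensors = <&tsens0 1>;

		trips {
			/*
			 * Single trip: LMH/EPSS handle throttling in hardware,
			 * so only the last-resort shutdown point is described.
			 */
			cpu-critical {
				temperature = <110000>;	/* millicelsius, placeholder */
				hysteresis = <1000>;	/* millicelsius, placeholder */
				type = "critical";
			};
		};
	};
};
```

With no lower trips defined, the zone has no intermediate thresholds for
tsens to raise UP/LOW interrupts on, which is the IRQ issue mentioned above.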