On Wed, 30 Nov 2022 at 15:04, Lukasz Luba <lukasz.luba@xxxxxxx> wrote:
Hi Vincent,
On 11/30/22 10:42, Vincent Guittot wrote:
Hi All
Just for the log and because it took me a while to figure out the root
cause of the problem: This patch also creates a regression for
snapdragon845 based systems and probably for any QC chipset that uses a
LUT to update the OPP table at boot. The behavior is the same as
described by Sam, with a stale value in the sugov_policy.max field.
Thanks for sharing this info and apologies that you spent cycles
on it.
I have checked the whole setup code (capacity + cpufreq policy and
governor). It looks like, to have a proper capacity of CPUs, we need
to wait till the last policy is created. It's due to the arch_topology.c
mechanism, which is only triggered after all CPUs have got their policy.
Unfortunately, this leads to a chicken & egg situation for this
schedutil setup of max capacity.
I have experimented with this code, which triggers an update in
schedutil once all CPUs have got their policy and sugov gov
(with trace_printk() to match the output below)
Your proposal below looks similar to what is done in arch_topology.c.
arch_topology.c triggers a rebuild of sched_domain and removes its
cpufreq notifier cb once it has visited all CPUs; could it also
trigger an update of each CPU's policy with cpufreq_update_policy()?
At this point you will be sure that the normalization has happened and
the max capacity will not change.
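For illustration only, the ordering problem and the suggested refresh can
be modelled in plain userspace C. This is a simplified sketch, not kernel
code and not the patch discussed in this thread: every toy_* name is
hypothetical, and it assumes each CPU reports the default scale of 1024
until the last policy has been created, after which raw capacities are
normalized against the largest one. The refresh_on_completion flag stands
in for the idea of updating every policy once normalization has happened.

```c
#include <stdbool.h>

#define NR_CPUS 4
#define TOY_CAPACITY_SCALE 1024UL

/* Hypothetical stand-in for the governor's per-policy cached maximum. */
struct toy_sugov_policy {
	unsigned long max;
};

struct toy_system {
	unsigned long raw_capacity[NR_CPUS]; /* un-normalized capacity */
	unsigned long cpu_scale[NR_CPUS];    /* capacity currently reported */
	bool visited[NR_CPUS];               /* policy created for this CPU? */
	struct toy_sugov_policy sg[NR_CPUS];
	bool refresh_on_completion;          /* model of the proposed update */
};

static void toy_init(struct toy_system *s, const unsigned long *raw,
		     bool refresh_on_completion)
{
	for (int c = 0; c < NR_CPUS; c++) {
		s->raw_capacity[c] = raw[c];
		s->cpu_scale[c] = TOY_CAPACITY_SCALE; /* assumed default */
		s->visited[c] = false;
		s->sg[c].max = 0;
	}
	s->refresh_on_completion = refresh_on_completion;
}

/* Scale every CPU against the largest raw capacity in the system. */
static void toy_normalize(struct toy_system *s)
{
	unsigned long max_raw = 1;

	for (int c = 0; c < NR_CPUS; c++)
		if (s->raw_capacity[c] > max_raw)
			max_raw = s->raw_capacity[c];
	for (int c = 0; c < NR_CPUS; c++)
		s->cpu_scale[c] =
			s->raw_capacity[c] * TOY_CAPACITY_SCALE / max_raw;
}

/*
 * Called once per CPU when its policy is created. The governor caches the
 * capacity seen at that moment; normalization only runs after the last CPU
 * has been visited. Returns true when normalization ran.
 */
static bool toy_policy_created(struct toy_system *s, int cpu)
{
	s->visited[cpu] = true;
	s->sg[cpu].max = s->cpu_scale[cpu]; /* cached, possibly too early */

	for (int c = 0; c < NR_CPUS; c++)
		if (!s->visited[c])
			return false;

	toy_normalize(s);

	/* With the refresh, every cached max is re-read after normalization. */
	if (s->refresh_on_completion)
		for (int c = 0; c < NR_CPUS; c++)
			s->sg[c].max = s->cpu_scale[c];
	return true;
}
```

With raw capacities of 300, 300, 600 and 600, the model without the
refresh leaves the cached max of CPU0 at 1024 while its normalized
capacity is 512; with the refresh, both agree.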
I don't know if it's a global problem or only for systems using arch_topology