Re: [PATCH v3 0/3] Add support for AArch64 AMUv1-based arch_freq_get_on_cpu

From: Beata Michalska
Date: Wed Apr 03 2024 - 17:35:02 EST


On Mon, Mar 25, 2024 at 09:10:26AM -0700, Vanshidhar Konda wrote:
> On Tue, Mar 12, 2024 at 08:34:28AM +0000, Beata Michalska wrote:
> > Introducing arm64 specific version of arch_freq_get_on_cpu, cashing in on
> > existing implementation for FIE and AMUv1 support: the frequency scale
> > factor, updated on each sched tick, serves as a base for retrieving
> > the frequency for a given CPU, representing an average frequency
> > reported between the ticks - thus its accuracy is limited.
> >
> > The changes have been rather lightly (due to some limitations) tested on
> > an FVP model.
> >
>
> I tested these changes on an Ampere system. The results from reading
> scaling_cur_freq look reasonable in the majority of cases I tested. I
> only saw some unexpected behavior with cores that were configured as
> nohz_full.
>
> I observed the unexplained behavior when I tested as follows:
> 1. Run stress on all cores
>      stress-ng --cpu 186 --timeout 10m --metrics-brief
> 2. Observe scaling_cur_freq and cpuinfo_cur_freq for all cores
>      scaling_cur_freq values were within a few % of cpuinfo_cur_freq
> 3. Kill stress test
> 4. Observe scaling_cur_freq and cpuinfo_cur_freq for all cores
>      scaling_cur_freq values were within a few % of cpuinfo_cur_freq for
>      most cores except the ones configured as nohz_full.
>
> nohz_full = 122-127
> core    scaling_cur_freq    cpuinfo_cur_freq
> [122]:  2997070             1000000
> [123]:  2997070             1000000
> [124]:  3000038             1000000
> [125]:  2997070             1000000
> [126]:  2997070             1000000
> [127]:  2997070             1000000
>
> These values were reflected for multiple seconds. I suspect the cores
> entered WFI and there was no update to the scale while those cores were
> idle.
>
Right, so the problem is with updating the counters upon entering idle. At
this point that is being done for all CPUs, whereas it should exclude the full
dynticks ones, otherwise it leads to such bad readings. So for nohz_full cores
the cpufreq driver will have to take care of getting the current frequency.
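
To illustrate the fallback described above, here is a rough userspace model,
not the actual fix and not kernel code. Everything in it is hypothetical:
struct cpu_model, model_freq_get_on_cpu() and SCALE_SHIFT are made-up names,
and the assumption is that a scale factor of 1024 corresponds to the maximum
frequency. It only shows the decision: trust the tick-updated scale factor on
housekeeping CPUs, and defer to whatever the cpufreq driver reports on
nohz_full ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: scale factor of 1 << SCALE_SHIFT (1024) means max frequency. */
#define SCALE_SHIFT 10

/* Hypothetical per-CPU state, for illustration only. */
struct cpu_model {
	bool nohz_full;           /* CPU is in the full dynticks set */
	uint32_t freq_scale;      /* last scale factor recorded on a tick */
	uint32_t max_freq_khz;    /* maximum frequency, in kHz */
	uint32_t driver_freq_khz; /* frequency the cpufreq driver would report */
};

/*
 * Return a frequency estimate in kHz.
 *
 * On a nohz_full CPU the scale factor may not have been refreshed for a
 * long time, so it is ignored and the driver-provided value is used.
 * Otherwise the estimate is derived from the scale factor.
 */
static uint32_t model_freq_get_on_cpu(const struct cpu_model *cpu)
{
	assert(cpu != NULL);

	if (cpu->nohz_full)
		return cpu->driver_freq_khz;

	/* Widen to 64 bits so the multiplication cannot overflow. */
	return (uint32_t)(((uint64_t)cpu->freq_scale * cpu->max_freq_khz)
			  >> SCALE_SHIFT);
}
```

With illustrative inputs (scale factor 1023, 3000000 kHz maximum, driver
reporting 1000000 kHz), the model returns 2997070 for a housekeeping CPU and
1000000 for a nohz_full one.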

Will be sending a fix for that.

Thank you very much for testing - appreciate that!

---
BR
Beata
> Thanks,
> Vanshi