Re: [PATCH] base: arch_topology: Use policy->max to calculate freq_factor

From: Rafael J. Wysocki
Date: Wed Nov 17 2021 - 10:17:56 EST


On Wed, Nov 17, 2021 at 4:08 PM Lukasz Luba <lukasz.luba@xxxxxxx> wrote:
>
>
>
> On 11/17/21 12:49 PM, Rafael J. Wysocki wrote:
> > On Wed, Nov 17, 2021 at 11:46 AM Lukasz Luba <lukasz.luba@xxxxxxx> wrote:
> >>
> >> Hi Rafael,
> >>
> >> On 11/16/21 7:05 PM, Rafael J. Wysocki wrote:
> >>> On Mon, Nov 15, 2021 at 9:10 PM Thara Gopinath
> >>> <thara.gopinath@xxxxxxxxxx> wrote:
> >>>>
> >>>> cpuinfo.max_freq can reflect boost frequency if enabled during boot. Since
> >>>> we don't consider boost frequencies while calculating cpu capacities, use
> >>>> policy->max to populate the freq_factor during boot up.
> >>>
> >>> I'm not sure about this. schedutil uses cpuinfo.max_freq as the max frequency.
> >>
> >> Agree it's tricky how we treat the boost frequencies and also combine
> >> them with thermal pressure.
> >> We probably would have to consider these design bits:
> >> 1. Should thermal pressure include boost frequency?
> >
> > Well, I guess so.
> >
> > Running at a boost frequency certainly increases thermal pressure.
> >
> >> 2. Should max capacity 1024 be a boost frequency so scheduler
> >> would see it explicitly?
> >
> > That's what it is now if cpuinfo.max_freq is a boost frequency.
> >
> >> - if no, then schedutil could still request boost freq thanks to
> >> map_util_perf() where we add 25% to the util and then
> >> map_util_freq() would return a boost freq when util was > 1024
> >>
> >>
> >> I can see in schedutil only one place where cpuinfo.max_freq is used:
> >> get_next_freq(). If the value stored in there is a boost,
> >> then don't we get a higher freq value for the same util?
> >
> > Yes, we do, which basically is my point.
> >
> > The schedutil's response is proportional to cpuinfo.max_freq and that
> > needs to be taken into account for the results to be consistent.
> >
>
> This boost thing wasn't an issue for us, because we didn't have
> platforms which come with it (till recently). I've checked that you have
> quite a few CPUs which support huge boost freq, e.g. 5GHz vs. 3.6GHz
> nominal max freq [1]. Am I reading this correctly as kernel boost freq?

That actually depends on the driver.

For instance, intel_pstate can be run with turbo (== boost) enabled or
disabled. If turbo is enabled, cpuinfo.max_freq is the max turbo
frequency.

In acpi_cpufreq things are sort of weird, because the highest bin in
there is a turbo frequency, but not the max one and it is used to
enable the entire turbo range. The driver sets cpuinfo.max_freq to
this one if boost is enabled IIRC.

> Do you represent this 5GHz as 1024 capacity?

Yes (but see above).

> From this schedutil get_next_freq() I would guess yes.
>
> I cannot find if you use thermal pressure, could you help me with this,
> please?

It is not used on x86 AFAICS.
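
To illustrate the proportionality discussed above, here is a minimal
user-space sketch of the arithmetic in get_next_freq(), as I understand
it for kernels of this era. map_util_perf() and map_util_freq() mirror
the kernel helpers named in this thread. sketch_next_freq() and
sketch_self_check() are hypothetical names, and the 3.6 GHz and 5 GHz
figures are only the example values quoted above. The sketch uses
unsigned long long where the kernel uses unsigned long, purely to avoid
overflow on 32-bit builds, and it leaves out policy clamping and
frequency-table resolution.

```c
#include <assert.h>

/* Mirrors the kernel helper of the same name: add 25% headroom to util. */
static unsigned long long map_util_perf(unsigned long long util)
{
	return util + (util >> 2);
}

/* Mirrors the kernel helper of the same name: scale freq by util / cap. */
static unsigned long long map_util_freq(unsigned long long util,
					unsigned long long freq,
					unsigned long long cap)
{
	return freq * util / cap;
}

/*
 * Hypothetical wrapper (not a kernel function): raw frequency request in
 * kHz for a given utilization, before clamping or table lookup.
 */
static unsigned long long sketch_next_freq(unsigned long long util,
					   unsigned long long max_freq_khz,
					   unsigned long long max_cap)
{
	return map_util_freq(map_util_perf(util), max_freq_khz, max_cap);
}

/* Hypothetical self-check using the example frequencies from the thread. */
static void sketch_self_check(void)
{
	/* Same util (512 of 1024), nominal 3.6 GHz as the max frequency. */
	assert(sketch_next_freq(512, 3600000, 1024) == 2250000);
	/* Same util, but the max frequency is a 5 GHz boost frequency. */
	assert(sketch_next_freq(512, 5000000, 1024) == 3125000);
	/* Full util plus headroom asks for more than the nominal max. */
	assert(sketch_next_freq(1024, 3600000, 1024) == 4500000);
}
```

With util at 512, the raw request moves from 2.25 GHz to 3.125 GHz when
the max frequency is the boost value, which is the "higher freq for the
same util" effect. The last case shows the 25% headroom producing a
request above the nominal max once util reaches 1024.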