Re: [PATCH 2/4] sched: cpufreq: Fix apply_dvfs_headroom() escaping uclamp constraints

From: Qais Yousef
Date: Sat Sep 16 2023 - 15:26:09 EST


On 09/12/23 18:03, Vincent Guittot wrote:

> And it seems that what is done today doesn't work correctly for you.
> Your proposal to include cpufreq headroom into the scheduler is not
> correct IMO and it only applies for some cases. Then, the cpufreq
> driver can have some really good reason to apply some headroom even
> with an uclamp value but it can't make any decision.
>
> I think that we should use another way to fix your problem with
> uclamp than reordering how headroom is applied by cpufreq. Mixing
> utilization and performance in one signal hides some differences that
> cpufreq can make good use of.
>
> As an example:
>
> cfs util = 650
> cfs uclamp = 800
> irq = 128
>
> cfs with headroom 650*1.25=812 is clamped to 800
>
> Final utilization will be : 800(1024-128)/1024+128*1.25=860 which is
> above the target of 800.
>
> When we look at the detail, we have:
>
> cfs util once scaled to the full range is only 650(1024-128)/1024= 568
>
> After applying irq (even with some headroom) 568+128*1.25 = 728 which
> is below the uclamp of 800 so shouldn't we stay at 800 in this case ?

Shouldn't it be (568+128)*1.25 = 870? That is almost the 860 above. We clamped
the 812 to 800, which, together with rounding errors, almost accounts for the
10-point difference between 870 and 860.
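
To make the numbers above easy to check, here is a minimal Python sketch of
the arithmetic. It assumes the 25% headroom is computed as util + util/4 with
integer truncation and that SCALE is the 1024 used in the example. The
function names are made up for illustration and are not kernel APIs.

```python
SCALE = 1024  # assumed capacity scale (the 1024 used in the example)

def headroom(util):
    """Assumed 25% headroom with integer truncation: util + util/4."""
    return util + util // 4

def scale_for_irq(util, irq):
    """Scale util into the capacity left over after irq time."""
    return util * (SCALE - irq) // SCALE

def headroom_per_element(cfs, uclamp_max, irq):
    """Headroom on cfs, clamp, scale for irq, then add irq with headroom."""
    cfs_perf = min(headroom(cfs), uclamp_max)
    return scale_for_irq(cfs_perf, irq) + headroom(irq)

def headroom_after_sum(cfs, irq):
    """Scale cfs for irq, add irq, then apply headroom to the sum."""
    return headroom(scale_for_irq(cfs, irq) + irq)

def headroom_on_irq_only(cfs, irq):
    """Scaled cfs without headroom, plus irq with headroom."""
    return scale_for_irq(cfs, irq) + headroom(irq)
```

With cfs = 650, uclamp = 800 and irq = 128, headroom_per_element() gives 860,
headroom_after_sum() gives 870 and headroom_on_irq_only() gives 728, matching
the three figures discussed above.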

I might have gotten the math wrong. But what I saw is that we have

util = (X + Y + Z) * A

and what I did

util = AX + AY + AZ

so maybe I messed something up, but I just applied the headroom multiplication
to each element individually rather than after the sum.
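
As a rough illustration of that point, the Python sketch below compares the
two forms. The function names are hypothetical and irq scaling is left out to
keep it short. Without a clamp the two forms are identical by distributivity.
They only diverge once one of the elements is clamped after its headroom is
applied.

```python
def headroom_after_sum(parts, a):
    """util = (X + Y + Z) * A"""
    return sum(parts) * a

def headroom_per_element(parts, a):
    """util = AX + AY + AZ"""
    return sum(p * a for p in parts)

def headroom_per_element_clamped(first, rest, cap, a):
    """Same as per-element, but the first element is capped after headroom."""
    return min(first * a, cap) + sum(r * a for r in rest)
```

For parts = [650, 128] and a = 1.25 both unclamped forms give 972.5, while
headroom_per_element_clamped(650, [128], 800, 1.25) gives 960.0, so any
difference comes from where the clamp sits and not from distributing A.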

So yeah, if I messed that part up, then that wasn't intentional and should be
done differently. But I still can't see it.

> >
> > The main change being done here actually is to apply_dvfs_headroom() *before*
> > doing uclamp_rq_util_with(). I am not sure how you see this mixing.
>
> Because dvfs_headroom is a cpufreq hint and you want to apply it
> somewhere else.

I am still not sure if you mean we are mixing up the code and we need better
abstraction or something else.

Besides the abstraction problem, which I agree with, I can't see what I am
mixing up yet :( Sorry, I think I need more of a helping hand to see it.

> > Current code performs apply_dvfs_headroom() *after*, which is what causes
> > the CPU to run at a performance level higher than rq->uclamp[UCLAMP_MAX].
> >
> > It doesn't matter how many tasks are on the rq, if rq->uclamp[UCLAMP_MAX] is
> > set to 800, then the CPU should not vote to max (assuming all other
> > pressures are 0).
>
> You can't remove the irq pressure from the picture. If
> rq->uclamp[UCLAMP_MAX] is set to 800 means that cpu must not go above
> 800, it should apply also after taking into account other inputs. At
> least up to some level as described in my example above

I was trying to simplify to understand what you mean, as I still don't think I
see the problem you're highlighting.
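
For the simplified case (all other pressures 0), the ordering question can be
sketched in Python as below. This assumes the headroom is util + util/4 and
models the uclamp max as a plain min(). The function names are hypothetical
stand-ins, not the kernel's apply_dvfs_headroom() or uclamp_rq_util_with().

```python
def headroom(util):
    """Assumed 25% headroom with integer truncation: util + util/4."""
    return util + util // 4

def clamp_then_headroom(util, uclamp_max):
    """Clamp first, headroom after: the result can exceed uclamp_max."""
    return headroom(min(util, uclamp_max))

def headroom_then_clamp(util, uclamp_max):
    """Headroom first, clamp after: the result never exceeds uclamp_max."""
    return min(headroom(util), uclamp_max)
```

With util = 900 and a uclamp max of 800, clamp_then_headroom() returns 1000
while headroom_then_clamp() returns 800.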


Thanks!

--
Qais Yousef