Re: [PATCH v6 2/2] cpuidle: teo: Introduce util-awareness

From: Vincent Guittot
Date: Fri May 31 2024 - 04:57:57 EST


On Wed, 29 May 2024 at 15:09, Christian Loehle <christian.loehle@xxxxxxx> wrote:
>
> On 5/28/24 15:07, Vincent Guittot wrote:
> > On Tue, 28 May 2024 at 11:59, Lukasz Luba <lukasz.luba@xxxxxxx> wrote:
> >>
> >> Hi Vincent,
> >>
> >> On 5/28/24 10:29, Vincent Guittot wrote:
> >>> Hi All,
> >>>
> >>> I'm quite late on this thread, but this patchset creates a major
> >>> regression for the psci cpuidle driver when using OSI mode (OS
> >>> initiated mode). In that case, the cpuidle driver only takes care of
> >>> the CPUs' power states, and the deeper C-states, which include cluster
> >>> and other power domains, are handled by the power domain framework.
> >>> In such a configuration, cpuidle has only 2 C-states: WFI and CPU off.
> >>> The other states, which include the clusters, are managed by genpd
> >>> and its governor.
> >>>
> >>> This patch selects cpuidle C-state N-1 as soon as the utilization is
> >>> above CPU capacity / 64, which means at most a level of 16 on the big
> >>> core but can be as low as 4 on little cores. These levels are very low,
> >>> and the main result is that as soon as there is very little activity
> >>> on a CPU, cpuidle always selects the WFI state whatever the estimated
> >>> sleep duration, which prevents any deeper state. Another effect is
> >>> that it also keeps the tick firing every 1ms in my case.
> >>
> >> Thanks for reporting this.
> >> Could you add what regression it's causing, please?
> >> Performance or higher power?
> >
> > It's not a perf but rather a power regression. I don't have a power
> > counter so it's difficult to give figures but I found it while running
> > a unitary test below on my rb5:
> > run 500us every 19457ms on medium core (uclamp_min: 600).
>
> Is that supposed to say 19.457ms?

Yes, it's a mistake. It's 19.457ms; I forgot to put the dot when
copying the value from the rt-app json file.
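
As a rough sanity check on the numbers in this thread, here is a minimal
sketch of the arithmetic. It assumes the threshold is simply capacity >> 6
(the "capacity / 64" described above) and uses the plain duty cycle as a
stand-in for utilization, which is not what the scheduler's PELT signal
actually reports. The function names and the little-core capacity of 256
are hypothetical, chosen only to match the "as low as 4" figure.

```python
# Back-of-the-envelope arithmetic only, not kernel code.
SCHED_CAPACITY_SCALE = 1024   # capacity of the biggest CPU
UTIL_THRESHOLD_SHIFT = 6      # assumption: threshold = capacity / 64


def util_threshold(capacity: int) -> int:
    """Utilization level above which the shallower state would be picked."""
    return capacity >> UTIL_THRESHOLD_SHIFT


def duty_cycle_util(run_us: float, period_us: float,
                    scale: int = SCHED_CAPACITY_SCALE) -> float:
    """Plain duty cycle scaled to the capacity range (crude proxy for util)."""
    return run_us / period_us * scale


if __name__ == "__main__":
    # Thresholds for a big core and a hypothetical little core.
    print("big threshold:   ", util_threshold(1024))
    print("little threshold:", util_threshold(256))

    # 500us of work every 19.457ms (corrected period) vs. 19457ms (typo).
    print("util, 19.457ms period:", round(duty_cycle_util(500, 19_457), 1))
    print("util, 19457ms period: ", round(duty_cycle_util(500, 19_457_000), 3))
```

Under these assumptions the corrected period gives a duty cycle of roughly
26/1024, which is above both thresholds, while the mistyped 19457ms period
would have given a value close to zero.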

> (Because below you say idle time is >18ms and total test time 5sec)
> Is the utilisation more like 1/20000 or 1/20?
> In any case what you describe is probably an issue, I'll try to reproduce.
> Note also my findings here:
> https://lore.kernel.org/lkml/0ce2d536-1125-4df8-9a5b-0d5e389cd8af@xxxxxxx/
>
> Kind Regards,
> Christian
>