Re: [RFC PATCH v2 0/1] cpuidle: teo: Introduce optional util-awareness
From: Rafael J. Wysocki
Date: Fri Oct 28 2022 - 11:05:14 EST
On Fri, Oct 28, 2022 at 5:01 PM Kajetan Puchalski
<kajetan.puchalski@xxxxxxx> wrote:
>
> On Fri, Oct 28, 2022 at 03:12:43PM +0200, Rafael J. Wysocki wrote:
>
> > > The result being that this util-aware TEO variant while using much less
> > > C1 and decreasing the percentage of too deep sleeps from ~24% to ~3% in
> > > PCMark Web Browsing also uses almost 2% less power. Clearly the power is
> > > being wasted on not hitting C1 residency over and over.
> >
> > Hmm. The PCMark Web Browsing table in your cover letter doesn't indicate that.
> >
> > The "gmean power usage" there for "teo + util-aware" is 205, whereas
> > for "teo" alone it is 187.8. This is still arguably balanced by the
> > latency difference (~100 us vs ~185 us, respectively), but this looks
> > like trading energy for performance.
>
> In this case yes, I meant 2% less compared to menu but you're right of
> course.
>
> [...]
>
> > Definitely it should not be changed if the previous state is a polling
> > one which can be checked right away. That would take care of the
> > "Intel case" automatically.
>
> Makes sense, I already used the polling flag to implement this in this other
> governor I mentioned.
>
> >
> > > Should make it much less intense for Intel systems.
> >
> > So I think that this adjustment only makes sense if the current
> > candidate state is state 1 and state 0 is not polling. In the other
> > cases the cost of missing an opportunity to save energy would be too
> > high for the observed performance gain.
>
> Interesting, but only applying it to C1 and only when C0 isn't polling would
> make it effectively not do anything on Intel systems, right?
Indeed.
> From what I've seen on Doug's plots even C1 is hardly ever used on his platform, most
> sleeps end up in the deepest possible state.
That depends a lot on the workload. There are workloads in which C1
is mostly used and the deeper idle states aren't.
> Checking for the polling flag is a good idea regardless so I can send a
> v3 with that. If you'd like me to also restrict the entire mechanism to
> only working on C1 as you suggested then I'm okay with including that in
> the v3 as well. What do you think?
It would be good to do that and see if there are any significant
differences in the results.
> Thanks a lot for all your time & input,
No problem at all.