Re: [PATCH] intel_idle: Add ICL support

From: Rafael J. Wysocki
Date: Wed Aug 26 2020 - 12:03:04 EST


On Wed, Aug 26, 2020 at 4:04 PM Guilhem Lettron <guilhem@xxxxxxxxxxx> wrote:
>
> On Wed, 26 Aug 2020 at 15:41, Zhang Rui <rui.zhang@xxxxxxxxx> wrote:
> >
> >
> > This is really hard to read.
> > can you please attach the two turbostat output as attachments?
>
> of course :)

Thanks!

A couple of things happen here AFAICS.

First, your processor seems to be unable to enter package C-states
deeper than PC3, so there is probably a device in the system (most
likely a PCI one) preventing it from doing that. If all goes well, it
should be able to get to at least PC8 without suspending the whole
system. That needs to be dealt with first, before we can draw
meaningful conclusions regarding which set of C-states to expose and
whether or not the one exposed via ACPI is sufficient.

To that end, I would try to upgrade the graphics firmware and see if
you can get some nonzero PC8 residency then.
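
For illustration only, here is a rough sketch of how the package
residency columns could be pulled out of saved turbostat output to
check for nonzero PC8 residency. It assumes the usual
whitespace-separated layout with a header line followed by a summary
row, and that the package columns are named like Pkg%pc2, Pkg%pc3 and
Pkg%pc8; the helper names are made up for this example.

```python
def package_residency(turbostat_text):
    """Return {column: percent} for the package C-state columns.

    Assumption: the first line containing a Pkg%pc* column is the
    header and the line right after it is the system summary row.
    Fields are matched by position, so empty fields would break this.
    """
    prefixes = ("Pkg%pc", "Pk%pc")
    lines = [ln.split() for ln in turbostat_text.splitlines() if ln.strip()]
    for i, fields in enumerate(lines[:-1]):
        if any(f.startswith(prefixes) for f in fields):
            row = lines[i + 1]
            return {
                name: float(value)
                for name, value in zip(fields, row)
                if name.startswith(prefixes)
            }
    return {}


def reaches(residency, column="Pkg%pc8"):
    """True if the given package C-state shows nonzero residency."""
    return residency.get(column, 0.0) > 0.0
```

Calling reaches(package_residency(text)) before and after the firmware
upgrade would show whether PC8 has become reachable.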

Second, ACPI exposes C1, C7s and C10 only and so you don't get any
CPU-C6 residency without the patch, but instead you get more CPU-C7
residency and more CPU-C1 residency. It is hard to say which is
better in principle, but if you look at what is asked for by the
governor, it turns out that deep C-states (C8-C10) are requested
around 54% of the time with the patch, whereas without it the ACPI_C3
state (corresponding to C10) is requested approximately 24% of the
time, which is much less often. That appears to translate to the
difference in PC2 residency (~30% with the patch vs ~17% without it).

Note, however, that (with the patch) C10 itself is asked for around
11% of the time which in turn is much less than the ~24% for the
corresponding ACPI_C3 (without the patch).
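
As a hedged sketch of how such "requested" percentages can be
cross-checked, the snippet below takes two snapshots of the per-state
"time" counters (in microseconds) from a CPU's cpuidle directory in
sysfs and turns the difference into a share of the measurement
interval. The function names and the choice of which state names count
as "deep" are assumptions made for this example; the actual state names
depend on the driver in use.

```python
from pathlib import Path


def read_cpuidle_times(cpu_dir):
    """Return {state_name: time_us} for one CPU, e.g. for
    /sys/devices/system/cpu/cpu0/cpuidle."""
    times = {}
    for state in sorted(Path(cpu_dir).glob("state*")):
        name = (state / "name").read_text().strip()
        times[name] = int((state / "time").read_text())
    return times


def requested_share(before, after, interval_us):
    """Percent of the interval accounted to each idle state."""
    if interval_us <= 0:
        raise ValueError("interval_us must be positive")
    return {
        name: 100.0 * (after[name] - before.get(name, 0)) / interval_us
        for name in after
    }


def combined_share(shares, names):
    """Sum the shares of a group of states (e.g. the deep ones)."""
    return sum(shares.get(name, 0.0) for name in names)
```

Taking one snapshot, sleeping for a known interval, taking another and
then calling combined_share() on, say, ("C8", "C9", "C10") gives a
per-CPU figure that can be compared between the two configurations.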

Overall, it looks like exposing C8 is beneficial from the energy usage
perspective, because (in the future, when the "blocking" device is
taken care of and the system can enter PC8 and deeper package
C-states) it may allow PC8 to be entered more often in principle, even
though it may reduce the amount of time spent in PC10 too (PC10 may be
generally difficult to enter, though). [Here I'm assuming that the
processor enters PC3 or PC2 instead of PC8 or deeper which cannot be
entered due to some resource dependency.]

OTOH exposing C1E doesn't seem to make much of a difference and
exposing C6 only causes it to be asked for instead of C7s, so exposing
the latter alone should be sufficient in theory.

So IMO the set of C-states exposed by ACPI looks almost sufficient, but
the jury is out until you can get the system to enter at least PC8.

Cheers!