Re: [PATCH v2 2/5] cpufreq: intel_pstate: Always return last EPP value from sysfs

From: Srinivas Pandruvada
Date: Tue Aug 25 2020 - 11:27:05 EST


On Tue, 2020-08-25 at 17:14 +0200, Rafael J. Wysocki wrote:
> On Tue, Aug 25, 2020 at 5:06 PM Srinivas Pandruvada
> <srinivas.pandruvada@xxxxxxxxxxxxxxx> wrote:
> > On Tue, 2020-08-25 at 16:51 +0200, Rafael J. Wysocki wrote:
> > > On Tue, Aug 25, 2020 at 8:20 AM Artem Bityutskiy <dedekind1@xxxxxxxxx> wrote:
> > > > On Mon, 2020-08-24 at 19:42 +0200, Rafael J. Wysocki wrote:
> > > > > From: "Rafael J. Wysocki" <rafael.j.wysocki@xxxxxxxxx>
> > > > >
> > > > > Make the energy_performance_preference policy attribute in sysfs
> > > > > always return the last EPP value written to it instead of the one
> > > > > currently in the HWP Request MSR to avoid possible confusion when
> > > > > the performance scaling algorithm is used in the active mode with
> > > > > HWP enabled (in which case the EPP is forced to 0 regardless of
> > > > > what value it has been set to via sysfs).
> > > >
> > > > Why is this a good idea, I wonder. If there was a prior discussion,
> > > > please, point to it.
> > > >
> > > > The general approach to changing settings via sysfs is often like
> > > > this:
> > > >
> > > > 1. Write new value.
> > > > 2. Read it back and verify that it is the same. Because there is no
> > > > better way to verify that the kernel "accepted" the value.
> > >
> > > If the write is successful (i.e. no errors returned and the value
> > > returned is equal to the number of written characters), the kernel
> > > *has* accepted the written value, but it may not have taken effect.
> > > These are two different things.
> > >
> > > The written value may take effect immediately or it may take effect
> > > later, depending on the current configuration etc. If you don't see
> > > the effect of it immediately, it doesn't mean that there was a
> > > failure of some sort.
> > >
> > > > Let's say I write 'balanced' to energy_performance_preference. I
> > > > read it back, and it contains 'balanced', so I am happy, I trust
> > > > the kernel changed EPP to "balanced".
> > > >
> > > > If the kernel, in fact, uses something else, I want to know about
> > > > it and have my script fail.
> > >
> > > Why do you want it to fail then?
> > >
> > > > Why is caching the value and making my script _think_ it succeeded
> > > > a good idea?
> > >
> > > Because when you change the scaling algorithm or the driver's
> > > operation mode, the value you have written will take effect.
> > >
> > > In this particular case it is explained in the driver documentation
> > > that the performance scaling algorithm in the active mode overrides
> > > the sysfs value and that's the only case when it can be overridden.
> > > So whatever you write to this attribute will not take effect
> > > immediately anyway, but it may take effect later.
> >
> > In some cases this happens even without changing active/passive
> > mode, when there was some error previously. For example:
> >
> > # cat energy_performance_preference
> > 127
> > [root@otcpl-perf-test-skx-i9 cpufreq]# rdmsr -p 1 0x774
> > 8000ff00
> >
> > I think we should show reality. A mode change can be a special case
> > that uses the stored value to restore in the new mode.
>
> OK, so I'll make it fail on attempts to change the EPP from 0
> (performance) in the active mode with the performance "governor".
>
Here the scaling governor is powersave.

# cat scaling_governor
powersave
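
For reference, the EPP field is bits 31:24 of the HWP Request MSR (0x774),
so the rdmsr output quoted above can be decoded with a small shell helper.
This is only a sketch: hwp_epp is a made-up name for illustration, not part
of msr-tools or the kernel, and it assumes the value is passed as the hex
string that rdmsr prints.

```shell
#!/bin/sh
# Sketch: extract the EPP field (bits 31:24) from an IA32_HWP_REQUEST
# (MSR 0x774) value. hwp_epp is a hypothetical helper name.
hwp_epp() {
    # $1: MSR value in hex, with or without a leading 0x
    val=${1#0x}
    echo $(( (0x$val >> 24) & 0xff ))
}

# Value read with "rdmsr -p 1 0x774" in the example above
hwp_epp 8000ff00
```

For 8000ff00 this prints 128, while sysfs reported 127 for the same CPU.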


Thanks,
Srinivas

> Cheers!