Re: [PATCH 1/2] Modify cpupower to schedule itself on cores it is reading MSRs from

From: Natarajan, Janakarajan
Date: Fri Oct 11 2019 - 12:58:35 EST

On 10/10/2019 6:22 AM, Thomas Renninger wrote:
> On Monday, October 7, 2019 11:11:30 PM CEST Natarajan, Janakarajan wrote:
>> On 10/5/2019 7:40 AM, Thomas Renninger wrote:
> ...
>>>> APERF/MPERF from CPL > 0) and avoid using the msr module (patch 2).
>>> And this one only exists on latest AMD cpus, right?
>> Yes. The RDPRU instruction exists only on AMD cpus.
>>>> However, for systems that provide an instruction to get register values
>>>> from userspace, would a command-line parameter be acceptable?
>>> Parameter sounds like a good idea. In fact, there already is such a
>>> parameter.
> cpupower monitor --help
>>> -c
>>> Schedule the process on every core before starting and ending
>>> measuring. This could be needed for the Idle_Stats monitor when no other
>>> MSR based monitor (has to be run on the core that is measured) is run in
>>> parallel. This is to wake up the processors from deeper sleep states and
>>> let the kernel reaccount its cpuidle (C-state) information before reading
>>> the cpuidle timings from sysfs.
>>> It would be best to swap the order of your patches. The 2nd looks rather
>>> straightforward and you can add my Reviewed-by.
>> The RDPRU instruction reads the APERF/MPERF of the cpu on which it is
>> running. If we do not schedule it on each cpu specifically, it will read the
>> APERF/MPERF of whichever cpu it happens to run on, which is not the correct
>> behavior.
> Got it. And I also didn't fully read -c. I now remember.. For C-states accounting
> you want to have each CPU woken up at measure start and end for accurate measuring.
> It's a pity that the monitors do the per_cpu calls themselves.
> So a general idle-monitor param is not possible, or can only be done by, for example,
> adding a flag to the cpuidle_monitor struct:
> struct cpuidle_monitor {
>         /* existing members omitted */
>         unsigned int needs_root:1;
>         unsigned int per_cpu_schedule:1;
> };
> not sure whether a:
> struct {
>         unsigned int needs_root:1;
>         unsigned int per_cpu_schedule:1;
> } flags;
> should/must be wrapped around them in a separate cleanup patch (with needs_root users adjusted).
> You (and other monitors for which this might make sense) can then implement
> the per_cpu_schedule flag. In the AMD case you will want to (in fact, have to)
> set it directly.
> On top of all this, a -b/-u (--bind-measure-to-cpu, --unbind-measure-to-cpu)
> parameter could be added at some point if it matters. Monitors that
> support it could then bind or not.
> This could possibly make the -c param obsolete. Or at least the idle state counter
> monitor could do it itself. But don't worry about this.
> What do you think?

This is a good suggestion. I can submit a v2 with:

a) a patch to readjust the needs_root variable

b) a patch to introduce and use the per_cpu_schedule

c) a patch to introduce and use the RDPRU instruction
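
For (a) and (b), a rough sketch of how the two flags could sit together in a bitfield struct, as suggested above. This is only an illustration: `struct monitor_sketch`, `monitor_wants_binding()` and the two example instances are hypothetical names, not the actual cpupower definitions, and all other members of the real struct are left out.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for cpupower's struct cpuidle_monitor.
 * Only the proposed flags are shown; every other member is omitted.
 */
struct monitor_sketch {
	const char *name;
	struct {
		unsigned int needs_root:1;       /* monitor requires root */
		unsigned int per_cpu_schedule:1; /* must run on the cpu it measures */
	} flags;
};

/* Hypothetical helper: should the caller bind to each cpu before reading? */
static bool monitor_wants_binding(const struct monitor_sketch *m)
{
	return m->flags.per_cpu_schedule != 0;
}

/* Illustrative instance: reads counters with a per-cpu instruction. */
static const struct monitor_sketch rdpru_like = {
	.name = "rdpru_like",
	.flags = { .needs_root = 0, .per_cpu_schedule = 1 },
};

/* Illustrative instance: reads values from sysfs, no binding needed. */
static const struct monitor_sketch sysfs_like = {
	.name = "sysfs_like",
	.flags = { .needs_root = 0, .per_cpu_schedule = 0 },
};
```

With something like this, a monitor only has to set `per_cpu_schedule` at definition time, and the common code can check the flag instead of each monitor deciding for itself.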

> And you should be able to re-use the bind_cpu function used in -c case?

Yes. I noticed that bind_cpu() is doing what I need. I will use that.
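
For reference, a minimal sketch of the kind of per-cpu binding being discussed, using the Linux `sched_setaffinity()` call. The helpers `pin_to_cpu()`, `for_each_allowed_cpu()` and `check_on_cpu()` are hypothetical names made up for this example; this is not the cpupower `bind_cpu()` code itself, and the callback only stands in for the actual counter read.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for CPU_SET(), sched_setaffinity(), sched_getcpu() */
#endif
#include <sched.h>

/* Hypothetical helper: restrict the calling thread to a single cpu. */
static int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	/* pid 0 means "the calling thread" */
	return sched_setaffinity(0, sizeof(set), &set);
}

/*
 * Hypothetical helper: run fn(cpu, ctx) once on every cpu the caller is
 * allowed to run on, then restore the original affinity mask.
 * Returns the number of cpus visited, or -1 on error.
 */
static int for_each_allowed_cpu(void (*fn)(int cpu, void *ctx), void *ctx)
{
	cpu_set_t orig;
	int cpu, visited = 0;

	if (sched_getaffinity(0, sizeof(orig), &orig) != 0)
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (!CPU_ISSET(cpu, &orig))
			continue;
		if (pin_to_cpu(cpu) != 0)
			continue; /* cpu went offline or is otherwise unavailable */
		fn(cpu, ctx); /* a per-cpu read (e.g. APERF/MPERF) would go here */
		visited++;
	}

	/* Put the original mask back so later code is not left pinned. */
	if (sched_setaffinity(0, sizeof(orig), &orig) != 0)
		return -1;
	return visited;
}

/* Example callback: counts how often we really ran on the requested cpu. */
static void check_on_cpu(int cpu, void *ctx)
{
	if (sched_getcpu() == cpu)
		(*(int *)ctx)++;
}
```

Calling `for_each_allowed_cpu(check_on_cpu, &hits)` with an `int hits = 0` visits each permitted cpu in turn. Restoring the original mask at the end is a design choice of this sketch so the tool does not stay pinned to the last cpu.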



> Thomas