Re: [PATCH v4 6/6] pseries/sysfs: Minimise IPI noise while reading [idle_][s]purr

From: Naveen N. Rao
Date: Thu Apr 02 2020 - 03:34:52 EST


Gautham R Shenoy wrote:
Hello Naveen,


On Wed, Apr 01, 2020 at 03:28:48PM +0530, Naveen N. Rao wrote:
Gautham R. Shenoy wrote:
>From: "Gautham R. Shenoy" <ego@xxxxxxxxxxxxxxxxxx>
>
[..snip..]

>+
>+static ssize_t show_purr(struct device *dev,
>+ struct device_attribute *attr, char *buf)
> {
>- u64 *ret = val;
>+ struct cpu *cpu = container_of(dev, struct cpu, dev);
>+ struct util_acct_stats *stats;
>
>- *ret = read_this_idle_purr();
>+ stats = get_util_stats_ptr(cpu->dev.id);
>+ return sprintf(buf, "%llx\n", stats->latest_purr);

This alters the behavior of the current sysfs purr file. I am not sure if it
is reasonable to return the same PURR value across a 10ms window.


It does reduce it to a 10ms window. I am not sure if anyone samples PURR
etc. faster than that rate.

I measured how much time it takes to read the purr, spurr, idle_purr,
idle_spurr files back-to-back. It takes no more than 150us. Will
lparstat read these values back-to-back? If so, we can reduce
the staleness_tolerance to something like 500us and still avoid extra
IPIs. If not, what is the maximum delay between the first sysfs file
read and the last sysfs file read?

Oh, for lparstat usage, this is perfectly fine.

I meant that there could be other users of [s]purr who might care. I don't know of one, but since this is an existing sysfs interface, I wanted to point out that the behavior might change.



I wonder if we should introduce a sysctl interface to control thresholding.
It can default to 0, which disables thresholding so that the existing
behavior continues. Applications (lparstat) can optionally set it to suit
their use.

We would be introducing 3 new sysfs interfaces that way instead of
two.

/sys/devices/system/cpu/purr_spurr_staleness
/sys/devices/system/cpu/cpuX/idle_purr
/sys/devices/system/cpu/cpuX/idle_spurr

I don't have a problem with this. Nathan, Michael, thoughts on this?
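
To make the thresholding idea above concrete, here is a rough userspace
sketch of how a staleness tolerance could gate the remote read. It is not
code from the patch: struct purr_cache, cached_purr_read() and the fake
reader are hypothetical names, and the reader callback merely stands in
for the IPI. A tolerance of 0 disables caching, so every read goes to the
remote CPU as it does today.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU cache entry; not the patch's util_acct_stats. */
struct purr_cache {
	uint64_t latest_purr;    /* last value fetched from the CPU */
	uint64_t last_update_ns; /* timestamp of that fetch */
	bool valid;              /* false until the first fetch */
};

/* Stand-in for the cross-CPU read that would cost an IPI. */
typedef uint64_t (*purr_reader_fn)(void *ctx);

/*
 * Return the cached value if it is younger than tolerance_ns,
 * otherwise refresh it through read_remote(). A tolerance of 0
 * means "never use the cache" (assumed default, per the proposal).
 */
static uint64_t cached_purr_read(struct purr_cache *c, uint64_t now_ns,
				 uint64_t tolerance_ns,
				 purr_reader_fn read_remote, void *ctx)
{
	bool fresh = c->valid && tolerance_ns != 0 &&
		     (now_ns - c->last_update_ns) < tolerance_ns;

	if (!fresh) {
		c->latest_purr = read_remote(ctx);
		c->last_update_ns = now_ns;
		c->valid = true;
	}
	return c->latest_purr;
}

/* Fake CPU used only to count how many "IPIs" a read pattern costs. */
struct fake_cpu {
	uint64_t purr;
	unsigned int ipis;
};

static uint64_t fake_remote_read(void *ctx)
{
	struct fake_cpu *cpu = ctx;

	cpu->ipis++;
	cpu->purr += 100; /* pretend the register advanced */
	return cpu->purr;
}
```

With a 500us tolerance, two reads 150us apart would cost one call to
fake_remote_read(); with a tolerance of 0, they would cost two.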


The alternative is to have a procfs interface, something like
/proc/powerpc/resource_util_stats

which gives a listing similar to /proc/stat, i.e.,

CPUX <purr> <idle_purr> <spurr> <idle_spurr>

Even in this case, the values can be obtained in one-shot with a
single IPI and be printed in the row corresponding to the CPU.

Right -- and that would be optimal, requiring only a single system call, at the cost of using a legacy interface.
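
For illustration only, one row of such a listing could be formatted as
below. The struct and function names are hypothetical, and printing the
values in hex is an assumption carried over from the "%llx" used by the
existing sysfs purr file. The thread does not settle the exact format.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot of one CPU's counters, taken in a single pass. */
struct util_sample {
	uint64_t purr;
	uint64_t idle_purr;
	uint64_t spurr;
	uint64_t idle_spurr;
};

/*
 * Format one "CPUX <purr> <idle_purr> <spurr> <idle_spurr>" row.
 * Returns what snprintf() returns: the length the row needs,
 * excluding the terminating NUL.
 */
static int format_util_row(char *buf, size_t len, unsigned int cpu,
			   const struct util_sample *s)
{
	return snprintf(buf, len,
			"CPU%u %" PRIx64 " %" PRIx64 " %" PRIx64 " %" PRIx64 "\n",
			cpu, s->purr, s->idle_purr, s->spurr, s->idle_spurr);
}
```

Since all four values for a CPU come from one snapshot, a reader gets a
consistent row per CPU from a single read of the file.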

The other option would be to drop this patch and to just go with patches 1-5 introducing the new sysfs interfaces for idle_[s]purr. It isn't entirely clear how often this would be used, or its actual impact. We can perhaps consider this optimization if and when this causes problems...


Thanks,
Naveen