Re: [PATCH v4 6/6] pseries/sysfs: Minimise IPI noise while reading [idle_][s]purr
From: Gautham R Shenoy
Date: Wed Apr 01 2020 - 08:01:47 EST
Hello Naveen,
On Wed, Apr 01, 2020 at 03:28:48PM +0530, Naveen N. Rao wrote:
> Gautham R. Shenoy wrote:
> >From: "Gautham R. Shenoy" <ego@xxxxxxxxxxxxxxxxxx>
> >
[..snip..]
> >-static DEVICE_ATTR(spurr, 0400, show_spurr, NULL);
> >-static DEVICE_ATTR(purr, 0400, show_purr, store_purr);
> > static DEVICE_ATTR(pir, 0400, show_pir, NULL);
> > static DEVICE_ATTR(tscr, 0600, show_tscr, store_tscr);
> > #endif /* CONFIG_PPC64 */
> >@@ -761,22 +757,110 @@ static void create_svm_file(void)
> > }
> > #endif /* CONFIG_PPC_SVM */
> >
> >+#ifdef CONFIG_PPC64
> >+/*
> >+ * The duration (in ms) from the last IPI to the target CPU until
> >+ * which a cached value of purr, spurr, idle_purr, idle_spurr can be
> >+ * reported to the user on a corresponding sysfs file read. Beyond
> >+ * this duration, fresh values need to be obtained by sending IPIs to
> >+ * the target CPU when the sysfs files are read.
> >+ */
> >+static unsigned long util_stats_staleness_tolerance_ms = 10;
>
> This is a nice optimization for our use in lparstat, though I have a concern
> below.
>
> >+struct util_acct_stats {
> >+ u64 latest_purr;
> >+ u64 latest_spurr;
> >+#ifdef CONFIG_PPC_PSERIES
> >+ u64 latest_idle_purr;
> >+ u64 latest_idle_spurr;
> >+#endif
>
> You can probably drop the 'latest_' prefix.
Sure.
>
> >+ unsigned long last_update_jiffies;
> >+};
> >+
> >+DEFINE_PER_CPU(struct util_acct_stats, util_acct_stats);
>
> Per snowpatch, this should be static, and so should get_util_stats_ptr()
> below:
> https://openpower.xyz/job/snowpatch/job/snowpatch-linux-sparse/16601//artifact/linux/report.txt
Ok, will fix this in v5.
>
> >+
> >+static void update_util_acct_stats(void *ptr)
> >+{
> >+ struct util_acct_stats *stats = ptr;
> >+
> >+ stats->latest_purr = mfspr(SPRN_PURR);
> >+ stats->latest_spurr = mfspr(SPRN_SPURR);
> > #ifdef CONFIG_PPC_PSERIES
> >-static void read_idle_purr(void *val)
> >+ stats->latest_idle_purr = read_this_idle_purr();
> >+ stats->latest_idle_spurr = read_this_idle_spurr();
> >+#endif
> >+ stats->last_update_jiffies = jiffies;
> >+}
> >+
> >+struct util_acct_stats *get_util_stats_ptr(int cpu)
> >+{
> >+ struct util_acct_stats *stats = per_cpu_ptr(&util_acct_stats, cpu);
> >+ unsigned long delta_jiffies;
> >+
> >+ delta_jiffies = jiffies - stats->last_update_jiffies;
> >+
> >+ /*
> >+ * If we have a recent enough data, reuse that instead of
> >+ * sending an IPI.
> >+ */
> >+ if (jiffies_to_msecs(delta_jiffies) < util_stats_staleness_tolerance_ms)
> >+ return stats;
> >+
> >+ smp_call_function_single(cpu, update_util_acct_stats, stats, 1);
> >+ return stats;
> >+}
> >+
> >+static ssize_t show_purr(struct device *dev,
> >+ struct device_attribute *attr, char *buf)
> > {
> >- u64 *ret = val;
> >+ struct cpu *cpu = container_of(dev, struct cpu, dev);
> >+ struct util_acct_stats *stats;
> >
> >- *ret = read_this_idle_purr();
> >+ stats = get_util_stats_ptr(cpu->dev.id);
> >+ return sprintf(buf, "%llx\n", stats->latest_purr);
>
> This alters the behavior of the current sysfs purr file. I am not sure if it
> is reasonable to return the same PURR value across a 10ms window.
It does reduce it to 10ms window. I am not sure if anyone samples PURR
etc faster than that rate.
I measured how much time it takes to read the purr, spurr, idle_purr,
idle_spurr files back-to-back. It takes not more than 150us. From
lparstat will these values be read back-to-back ? If so, we can reduce
the staleness_tolerance to something like 500us and still avoid extra
IPIs. If not, what is the maximum delay between the first sysfs file
read and the last sysfs file read ?
>
> I wonder if we should introduce a sysctl interface to control thresholding.
> It can default to 0, which disables thresholding so that the existing
> behavior continues. Applications (lparstat) can optionally set it to suit
> their use.
We would be introducing 3 new sysfs interfaces that way instead of
two.
/sys/devices/system/cpu/purr_spurr_staleness
/sys/devices/system/cpu/cpuX/idle_purr
/sys/devices/system/cpu/cpuX/idle_spurr
I don't have a problem with this. Nathan, Michael, thoughts on this?
The alternative is to have a procfs interface, something like
/proc/powerpc/resource_util_stats
which gives a listing similar to /proc/stat, i.e
CPUX <purr> <idle_purr> <spurr> <idle_spurr>
Even in this case, the values can be obtained in one-shot with a
single IPI and be printed in the row corresponding to the CPU.
>
> - Naveen
>
--
Thanks and Regards
gautham.