Re: [PATCH 08/10] psi: pressure stall information for CPU, memory, and IO
From: Johannes Weiner
Date: Fri Jul 20 2018 - 10:11:10 EST
On Wed, Jul 18, 2018 at 06:06:23PM -0400, Johannes Weiner wrote:
> On Tue, Jul 17, 2018 at 05:01:42PM +0200, Peter Zijlstra wrote:
> > On Thu, Jul 12, 2018 at 01:29:40PM -0400, Johannes Weiner wrote:
> > > +static bool psi_update_stats(struct psi_group *group)
> > > +{
> > > + u64 some[NR_PSI_RESOURCES] = { 0, };
> > > + u64 full[NR_PSI_RESOURCES] = { 0, };
> > > + unsigned long nonidle_total = 0;
> > > + unsigned long missed_periods;
> > > + unsigned long expires;
> > > + int cpu;
> > > + int r;
> > > +
> > > + mutex_lock(&group->stat_lock);
> > > +
> > > + /*
> > > + * Collect the per-cpu time buckets and average them into a
> > > + * single time sample that is normalized to wallclock time.
> > > + *
> > > + * For averaging, each CPU is weighted by its non-idle time in
> > > + * the sampling period. This eliminates artifacts from uneven
> > > + * loading, or even entirely idle CPUs.
> > > + *
> > > + * We could pin the online CPUs here, but the noise introduced
> > > + * by missing up to one sample period from CPUs that are going
> > > + * away shouldn't matter in practice - just like the noise of
> > > + * previously offlined CPUs returning with a non-zero sample.
> >
> > But why!? cpuu_read_lock() is neither expensive nor complicated. So why
> > try and avoid it?
>
> Hm, I don't feel strongly about it either way. I'll add it.
Thinking more about it, this really doesn't buy anything. Whether a
CPU comes online or goes offline during the loop is no different than
that happening right before grabbing the cpus_read_lock(). If we see a
sample from a CPU, we incorporate it, if not we don't.
So it's not so much avoidance as it's lack of reason for synchronizing
against hotplugging in any fashion. The comment is wrong. This noise
it points to is there with and without the lock, and the only way to
avoid it would be to do either for_each_possible_cpu() in that loop or
having a hotplug callback that would flush the offlining CPU bucket
into a holding place for missed dead cpu samples that the aggregation
loop checks every time. Neither of these seem remotely worth the cost.
I'll fix the comment instead.