Re: [PATCH 2/3] sched: Correctly handle nohz ticks cpu load accounting

From: Peter Zijlstra
Date: Fri Apr 08 2016 - 13:41:16 EST


On Fri, Apr 08, 2016 at 02:53:21PM +0200, Frederic Weisbecker wrote:
> On Fri, Apr 08, 2016 at 11:41:54AM +0200, Peter Zijlstra wrote:
> > On Fri, Apr 08, 2016 at 03:07:12AM +0200, Frederic Weisbecker wrote:
> > > +void cpu_load_update_nohz_start(void)
> > > {
> > > struct rq *this_rq = this_rq();
> > > +
> > > + /*
> > > + * This is all lockless but should be fine. If weighted_cpuload changes
> > > + * concurrently we'll exit nohz. And cpu_load write can race with
> > > + * cpu_load_update_idle() but both updater would be writing the same.
> > > + */
> > > + this_rq->cpu_load[0] = weighted_cpuload(cpu_of(this_rq));
> > > +}
> >
> > There is more to this; this also updates ->cpu_load[0] at possibly much
> > higher frequency than we've done before, while not updating the other
> > ->cpu_load[] members.
> >
> > Now, I'm not sure we care, but it is a bit odd.
>
> This is right. cpu_load[0] is aimed at showing the most recent load,
> without knowing when was the last update (the last tick/update could
> have been recent or not, the readers shouldn't care). Now we can
> indeed worry about it if this field is read altogether with the other
> indexes and those are put into some relation. But it seems that each
> of the rq->cpu_load[i] are read independently without relation or
> comparison. Now really I'm saying that without much assurance as I
> don't know the details of the load balancing code.
>
> If in doubt I can create a field in struct rq to record the tickless
> load instead of storing it in cpu_load[0]. That was in fact the first
> direction I took in my drafts.

Yeah, no I think this is fine, just maybe wants a comment.

> > > +/*
> > > + * Account the tickless load in the end of a nohz frame.
> > > + */
> > > +void cpu_load_update_nohz_stop(void)
> > > +{
> > > unsigned long curr_jiffies = READ_ONCE(jiffies);
> > > + struct rq *this_rq = this_rq();
> > > + unsigned long load;
> > >
> > > if (curr_jiffies == this_rq->last_load_update_tick)
> > > return;
> > >
> > > + load = weighted_cpuload(cpu_of(this_rq));
> > > raw_spin_lock(&this_rq->lock);
> > > + cpu_load_update_nohz(this_rq, curr_jiffies, load);
> > > raw_spin_unlock(&this_rq->lock);
> > > }
> >
> > And this makes us take rq->lock when waking from nohz; a bit
> > unfortunate. Do we really need this though?
>
> Note it was already the case before this patchset, the function was called
> cpu_load_update_nohz() instead.

Ah, ok. I got lost in the whole rename + nohz maze (again). OK no
problem then.