Re: [PATCH v4 2/2] sched: consider missed ticks in full NOHZ

From: Peter Zijlstra
Date: Mon Nov 09 2015 - 05:36:17 EST


On Mon, Nov 09, 2015 at 11:36:54AM +0900, Byungchul Park wrote:
> On Mon, Nov 02, 2015 at 05:10:16PM +0100, Peter Zijlstra wrote:
> > On Wed, Oct 14, 2015 at 06:47:36PM +0900, byungchul.park@xxxxxxx wrote:
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -4428,7 +4428,7 @@ static void update_idle_cpu_load(struct rq *this_rq)
> >
> > So if one were to read the comment above update_idle_cpu_load() one
> > would find there's a problem with jiffy based accounting.
> >
> > > /*
> > > * Called from tick_nohz_idle_exit() -- try and fix up the ticks we missed.
> > > */
> > > -void update_cpu_load_nohz(void)
> > > +void update_cpu_load_nohz(int active)
> >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index 7c7ec45..515edf3 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> >
> > > -static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
> > > +static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now, int active)
> > > {
> > > /* Update jiffies first */
> > > tick_do_update_jiffies64(now);
> > > - update_cpu_load_nohz();
> > > + update_cpu_load_nohz(active);
> > >
> > > calc_load_exit_idle();
> > > touch_softlockup_watchdog();
> >
> > And we could solve all that nicely if we pull up the hrtimer_forward()
> > result from tick_nohz_restart(), that way we have the actual number of
> > ticks lost on this cpu, and no need to start guessing about it.
>
> Hello,
>
> Are you talking about the lag between the writer and the reader of jiffies?
> I think your proposal can solve the problem in update_cpu_load_nohz(),
> but it is still hard to handle update_idle_cpu_load() and
> update_cpu_load_active(), even with the approach you proposed.
>
> Do you think it would be OK even if it solves only one case?
> update_idle_cpu_load() would still need to guess. Is there something
> I missed? Or did I misunderstand what you intend?

I was thinking of getting rid of rq->last_load_update_tick entirely. If
we can pass in how many (local) ticks were lost on this cpu, we don't
have to rely on the jiffy counter at all.
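
As a rough illustration only, the idea can be modelled in userspace as
below. This is an untested sketch and not kernel code: struct model_timer,
model_timer_forward() and model_update_load() are made-up names, and the
halving decay is a stand-in assumption for the real cpu_load[] arithmetic.
The point it shows is that a forward operation which returns the number of
periods it skipped (as hrtimer_forward() does with its overrun count) gives
the load update a local tick count, with no jiffy snapshot to compare
against.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct model_timer {
	uint64_t expires;	/* next expiry, in ns */
};

/*
 * Push the timer forward in whole periods until it expires after 'now'.
 * Returns how many periods were added, in the spirit of the overrun
 * count returned by hrtimer_forward(). Returns 0 if not yet expired.
 */
static uint64_t model_timer_forward(struct model_timer *t, uint64_t now,
				    uint64_t period)
{
	uint64_t missed;

	assert(period > 0);
	if (now < t->expires)
		return 0;

	missed = (now - t->expires) / period + 1;
	t->expires += missed * period;
	return missed;
}

/*
 * Fold 'ticks' elapsed ticks into a decaying load average.
 * Assumption for illustration: every tick but the last was missed and
 * halves the old value; the last tick averages in the current load.
 * No "last update" timestamp is kept anywhere.
 */
static unsigned long model_update_load(unsigned long old, unsigned long cur,
				       uint64_t ticks)
{
	uint64_t i;

	if (!ticks)
		return old;

	for (i = 1; i < ticks && old; i++)
		old >>= 1;

	return (old + cur) >> 1;
}
```

A caller would feed the return value of model_timer_forward() straight
into model_update_load() when the tick is restarted, so the only state
involved is the timer's own expiry.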