Re: [PATCH] timer: Lazily wakup nohz CPU when adding new timer.
From: Frederic Weisbecker
Date: Tue Oct 27 2015 - 11:11:24 EST
On Wed, Oct 21, 2015 at 04:16:31PM +0530, Viresh Kumar wrote:
> Cc'ing Frederic.
>
> On 20-10-15, 15:47, Yunhong Jiang wrote:
> > On Sun, Oct 11, 2015 at 08:12:39PM +0200, Thomas Gleixner wrote:
> > > On Mon, 28 Sep 2015, Yunhong Jiang wrote:
> > > > static void internal_add_timer(struct tvec_base *base, struct timer_list *timer)
> > > > {
> > > > + bool kick_nohz = false;
> > > > +
> > > > /* Advance base->jiffies, if the base is empty */
> > > > if (!base->all_timers++)
> > > > base->timer_jiffies = jiffies;
> > > > @@ -424,9 +426,17 @@ static void internal_add_timer(struct tvec_base *base, struct timer_list *timer)
> > > > */
> > > > if (!(timer->flags & TIMER_DEFERRABLE)) {
> > > > if (!base->active_timers++ ||
> > > > - time_before(timer->expires, base->next_timer))
> > > > + time_before(timer->expires, base->next_timer)) {
> > > > base->next_timer = timer->expires;
> > > > - }
> > > > + /*
> > > > + * CPU in dynticks need reevaluate the timer wheel
> > > > + * if newer timer added with next_timer updated.
> > > > + */
> > > > + if (base->nohz_active)
> > > > + kick_nohz = true;
> > > > + }
> > > > + } else if (base->nohz_active && tick_nohz_full_cpu(base->cpu))
> > > > + kick_nohz = true;
> > >
> > > Why do you want to kick the other cpu when a deferrable timer got added?
> >
> > This is what happens in current implementation and this patch does not
> > change the logic. According to the comments, it's to avoid race with
> > idle_cpu(). Frankly speaking, I didn't get the idea of the race.
> >
> > Viresh, do you have any hints?
>
> I haven't looked at the core since few months now and looks like I
> don't remember anything :)
>
> This thread is where we discussed it initially:
> http://marc.info/?l=linux-kernel&m=139039035809125
>
> AFAIU, this is why we kick the other CPU for a deferrable timer:
> - The other CPU is a full-dynticks capable CPU and may be running
> tickless and we should serve the timer in time (even if it is
> deferrable) if the CPU isn't idle.
> - We could have saved the kick for a full-dynticks idle CPU, but a
> race can happen where we thought the CPU is idle, but it has just
> started serving userspace tick-lessly. And the timer wouldn't be
> served for long time, even when the cpu was busy.
Yeah, deferrable implies "idle deferrable", not "user deferrable". So if
the CPU is tickless in userland, we want to kick it to handle the
deferrable timer.
This is further optimizable by making sure that we don't kick nohz full
CPUs when they are idle as well.
That said the handling of deferrable timers is buggy in full dynticks because
get_next_timer_interrupt() ignores deferrable timers always. I should
pass an "ignore_deferrable" parameter for non-idle dynticks but I'm afraid
this will uncover timers that people don't want to see. But we'll have to
do it eventually and maybe audit which of these deferrable timers also implies
deferrable in userspace.
>
> Ofcourse, Frederic will kick me if I forgot the lessons he gave me
> earlier :)
Oh I know too much how easy it is to forget what $CODE_I_REVIEWED_X_MONTH_AGO
actually does ;-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/