Re: kernel/timer: avoid spurious ksoftirqd wakeups (v2)

From: Thomas Gleixner
Date: Tue Apr 07 2015 - 17:10:33 EST


On Mon, 6 Apr 2015, Marcelo Tosatti wrote:
> It is only necessary to raise timer softirq
> in case there are active timers.

Depends. See below.

> Limit the ksoftirqd wakeup to that case.
>
> Fixes a latency spike with isolated CPUs and
> nohz full mode.

This lacks a proper explanation of the observed issue.

> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -568,6 +568,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> unsigned long rcu_delta_jiffies;
> struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
> u64 time_delta;
> + bool raise_softirq = false;

This shadows the function name raise_softirq(). Not pretty.

> time_delta = timekeeping_max_deferment();
>
> @@ -584,7 +585,8 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> delta_jiffies = 1;
> } else {
> /* Get the next timer wheel timer */
> - next_jiffies = get_next_timer_interrupt(last_jiffies);
> + next_jiffies = get_next_timer_interrupt(last_jiffies,
> + &raise_softirq);
> delta_jiffies = next_jiffies - last_jiffies;
> if (rcu_delta_jiffies < delta_jiffies) {
> next_jiffies = last_jiffies + rcu_delta_jiffies;
> @@ -703,7 +705,8 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> */
> tick_do_update_jiffies64(ktime_get());
> }
> - raise_softirq_irqoff(TIMER_SOFTIRQ);
> + if (raise_softirq)
> + raise_softirq_irqoff(TIMER_SOFTIRQ);

This breaks when high resolution timers are disabled (compile or
runtime) because then the hrtimer queues are run from the timer
softirq.

Now assume the following situation:

Tick is stopped completely with no timers and no hrtimers pending.

Interrupt happens and schedules a hrtimer.

nohz_stop_sched_tick()
get_next_timer_interrupt(..., &raise_softirq);

---> base->active_timers = 0, so raise_softirq is false

tick_program_event(expires)
clockevents_program_event(expires)

---> Assume expires is already in the past

if (expires <= ktime_get())
return -ETIME;

if (raise_softirq)
raise_softirq_irqoff(TIMER_SOFTIRQ);

So because the tick device was not armed you wont get a tick
interrupt up to the point where tick_nohz_stop_sched_tick() is called
again which might be far off.

I can see that the unconditional raise_softirq_irqoff() is suboptimal,
but it was a rather simple solution to get stuff rolling again because
it forces the cpu out of the inner idle loop which in turn restarts
the tick.

We certainly can do better, but that needs more thought than the
proposed solution.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/