Re: [RFC] tick/nohz: schedule TIMER_SOFTIRQ immediately for expired timers

From: Frederic Weisbecker
Date: Fri May 26 2017 - 08:55:41 EST


On Wed, May 24, 2017 at 03:39:24PM +0300, Octavian Purdila wrote:
> Currently, when detecting expired timers in tick_nohz_stop_sched_tick
> we just schedule a new hrtimer and let its handler to schedule
> TIMER_SOFITRQ.
>
> This can lead to indefinite timer stalls if the system is busy with a
> stream of interrupts that have the period shorter or about the same
> with the value of the minimum delay of the current clocksource driver:
>
> -> idle
> -> IRQ (time = x)
> -> irq_exit -> tick_nohz_irq_exit -> tick_nohz_stop_sched_tick
> -> tick_nohz_restart
> -> cancel hrtimer (set clocksource event to x + max_delay)
> -> set clocksource to x + min_delay
> ...
> -> IRQ (time = y, where y < x + min_delay)
> -> irq_exit -> tick_nohz_irq_exit -> tick_nohz_stop_sched_tick
> -> tick_nohz_restart
> -> cancel hrtimer (set clocksource event to x + max_delay)
> -> set clocksource to y + min_delay
>
> So, instead of prodding the hrtimer interrupt, schedule TIMER_SOFTIRQ
> since we know that timers are ready. The timers will run either at the
> next interrupt or from ksoftirq so no hrtimer interrupt is
> needed. This also avoids spurious programming of the clocksource in
> this scenario.
>
> Signed-off-by: Octavian Purdila <octavian.purdila@xxxxxxx>
> ---
> kernel/time/tick-sched.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 3bcb61b..0a30278 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -727,9 +727,16 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> *
> * Only once we exit the idle loop will we re-enable the tick,
> * see tick_nohz_idle_exit().
> + *
> + * Also, make sure we schedule TIMER_SOFTIRQ now instead of
> + * relying on the hrtimer interrupt to do it to avoid
> + * postponing processing of expired timers. If we have a
> + * constant stream of interrupts with a period shorter than
> + * the minimum delay of the current clocksource we can end up
> + * postponing the timers indefinitely.
> */
> if (delta == 0) {
> - tick_nohz_restart(ts, now);
> + raise_softirq(TIMER_SOFTIRQ);
> goto out;
> }
> }

Nice catch! And the patch should work but that actually restores a behaviour we've
removed some time ago. We wanted to get rid of that softirq raise.

So discussing this with Thomas, here is an alternate solution. That tick restart looks
like an unnecessary special case. In fact the normal path ending with hrtimer_start()
in tick_nohz_stop_sched_tick() should work if the delta deadline is 0. And if
we do so, we benefit from the optimization path:

if (ts->tick_stopped && (expires == dev->next_event))

...which avoids the cancel/reprog game and therefore should fix that issue.

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index ed18ca5..58c257c 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -713,8 +713,6 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
*/
delta = next_tick - basemono;
if (delta <= (u64)TICK_NSEC) {
- tick = 0;
-
/*
* Tell the timer code that the base is not idle, i.e. undo
* the effect of get_next_timer_interrupt():
@@ -724,23 +722,8 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
* We've not stopped the tick yet, and there's a timer in the
* next period, so no point in stopping it either, bail.
*/
- if (!ts->tick_stopped)
- goto out;
-
- /*
- * If, OTOH, we did stop it, but there's a pending (expired)
- * timer reprogram the timer hardware to fire now.
- *
- * We will not restart the tick proper, just prod the timer
- * hardware into firing an interrupt to process the pending
- * timers. Just like tick_irq_exit() will not restart the tick
- * for 'normal' interrupts.
- *
- * Only once we exit the idle loop will we re-enable the tick,
- * see tick_nohz_idle_exit().
- */
- if (delta == 0) {
- tick_nohz_restart(ts, now);
+ if (!ts->tick_stopped) {
+ tick = 0;
goto out;
}
}