Re: [patch 2/3] timers: do not raise softirq unconditionally (spinlockless version)
From: Anna-Maria Gleixner
Date: Tue Jun 11 2019 - 07:49:51 EST
On Fri, 31 May 2019, Anna-Maria Gleixner wrote:
[...]
> I will think about the problem and your solution a little bit more and
> give you feedback hopefully on monday.
I'm sorry for the delay. But now I'm able to give you a detailed feedback:
The general problem is, that your solution is customized to a single
use-case: preventing softirq raise but only if there is _no_ timer
pending. To reach this goal without using locks, overhead is added to the
formerly optimized add/mod path of a timer. With your code the timer
softirq is raised even when there is a pending timer which does not has to
be expired right now. But there have been requests in the past for this use
case already.
I discussed with Thomas several approaches during the last week how to
solve the unconditional softirq timer raise in a more general way without
loosing the fast add/mod path of a timer. The approach which seems to be
the best has a dependency on a timer code change from a push to pull model
which is still under developement (see v2 patchset:
http://lkml.kernel.org/r/20170418111102.490432548@xxxxxxxxxxxxx). The
patchset v2 has several problems but we are working on a solution for those
problems right now. When the timer pull model is in place the approach to
solve the unconditional timer softirq raise could look like the following:
---8<---
The next_expiry value of timer_base struct is used to store the next expiry
value even if timer_base is not idle. Therefore it is udpated after adding
or modifying a timer and also at the end of timer softirq. In case timer
softirq does not has to be raised, the timer_base->clk is incremented to
prevent stale clocks. Checking whether timer softirq has to be raised
cannot be done lockless.
This code is not compile tested nor boot tested.
---
kernel/time/timer.c | 60 +++++++++++++++++++++++++++++++---------------------
1 file changed, 36 insertions(+), 24 deletions(-)
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -552,37 +552,32 @@ static void
static void
trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer)
{
- if (!is_timers_nohz_active())
- return;
-
- /*
- * TODO: This wants some optimizing similar to the code below, but we
- * will do that when we switch from push to pull for deferrable timers.
- */
- if (timer->flags & TIMER_DEFERRABLE) {
- if (tick_nohz_full_cpu(base->cpu))
- wake_up_nohz_cpu(base->cpu);
- return;
+ if (is_timers_nohz_active()) {
+ /*
+ * TODO: This wants some optimizing similar to the code
+ * below, but we will do that when we switch from push to
+ * pull for deferrable timers.
+ */
+ if (timer->flags & TIMER_DEFERRABLE) {
+ if (tick_nohz_full_cpu(base->cpu))
+ wake_up_nohz_cpu(base->cpu);
+ return;
+ }
}
- /*
- * We might have to IPI the remote CPU if the base is idle and the
- * timer is not deferrable. If the other CPU is on the way to idle
- * then it can't set base->is_idle as we hold the base lock:
- */
- if (!base->is_idle)
- return;
-
/* Check whether this is the new first expiring timer: */
if (time_after_eq(timer->expires, base->next_expiry))
return;
+ /* Update next expiry time */
+ base->next_expiry = timer->expires;
/*
- * Set the next expiry time and kick the CPU so it can reevaluate the
- * wheel:
+ * We might have to IPI the remote CPU if the base is idle and the
+ * timer is not deferrable. If the other CPU is on the way to idle
+ * then it can't set base->is_idle as we hold the base lock:
*/
- base->next_expiry = timer->expires;
- wake_up_nohz_cpu(base->cpu);
+ if (is_timers_nohz_active() && base->is_idle)
+ wake_up_nohz_cpu(base->cpu);
}
static void
@@ -1684,6 +1679,7 @@ static inline void __run_timers(struct t
while (levels--)
expire_timers(base, heads + levels);
}
+ base->next_expiry = __next_timer_interrupt(base);
base->running_timer = NULL;
raw_spin_unlock_irq(&base->lock);
}
@@ -1716,8 +1712,24 @@ void run_local_timers(void)
base++;
if (time_before(jiffies, base->clk))
return;
+ base--;
+ }
+
+ /*
+ * check for next expiry
+ *
+ * deferrable base is igonred here - it is only usable when
+ * switching from push to pull model for deferrable timers
+ */
+ raw_spin_lock_irq(&base->lock);
+ if (base->clk == base->next_expiry) {
+ raw_spin_unlock_irq(&base->lock);
+ raise_softirq(TIMER_SOFTIRQ);
+ } else {
+ base->clk++;
+ raw_spin_unlock_irq(&base->lock);
+ return;
}
- raise_softirq(TIMER_SOFTIRQ);
}
/*
---8<---
Thanks,
Anna-Maria