Re: [patch 1/3] timers: raise timer softirq on __mod_timer/add_timer_on
From: Marcelo Tosatti
Date: Thu May 30 2019 - 16:19:24 EST
Hi Anna-Maria,
On Wed, May 29, 2019 at 04:53:05PM +0200, Anna-Maria Gleixner wrote:
> On Mon, 15 Apr 2019, Marcelo Tosatti wrote:
>
> [...]
>
> > The patch "timers: do not raise softirq unconditionally" from Thomas
> > attempts to address that by checking, in the sched tick, whether it's
> > necessary to raise the timer softirq.
https://lore.kernel.org/patchwork/patch/446045/
> > Unfortunately, it attempts to grab
> > the tvec base spinlock which generates the issue described in the patch
> > "Revert "timers: do not raise softirq unconditionally"".
https://lore.kernel.org/patchwork/patch/552474/
> Both patches are not available in the version your patch set is based
> on. Better pointers would be helpful.
See above.
>
> > tvec_base->lock protects addition of timers to the wheel versus
> > timer interrupt execution.
>
> The timer_base->lock (formerly known as tvec_base->lock), synchronizes all
> accesses to timer_base and not only addition of timers versus timer
> interrupt execution. Deletion of timers, getting the next timer interrupt,
> forwarding the base clock and migration of timers are protected as well by
> timer_base->lock.
Right.
> > This patch does not grab the tvec base spinlock from irq context,
> > but rather performs a lockless access to base->pending_map.
>
> I cannot see where this patch performs a lockless access to
> timer_base->pending_map.
The lockless access is in the next patch of the series:
[patch 2/3] timers: do not raise softirq unconditionally (spinlockless version)
> > It handles the race between timer addition and timer interrupt
> > execution by unconditionally (in case of isolated CPUs) raising the
> > timer softirq after making sure the updated bitmap is visible
> > on remote CPUs.
>
> So after modifying a timer on a non housekeeping timer base, the timer
> softirq is raised - even if there is no pending timer in the next
> bucket. Only with this patch, this shouldn't be a problem - but it is an
> additional raise of timer softirq and an overhead when adding a timer,
> because the normal timer softirq is raised from sched tick anyway.
It should be clear why this is necessary when reading
[patch 2/3] timers: do not raise softirq unconditionally (spinlockless
version)
>
> > Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> >
> > ---
> > kernel/time/timer.c | 38 ++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 38 insertions(+)
> >
> > Index: linux-rt-devel/kernel/time/timer.c
> > ===================================================================
> > --- linux-rt-devel.orig/kernel/time/timer.c 2019-04-15 13:56:06.974210992 -0300
> > +++ linux-rt-devel/kernel/time/timer.c 2019-04-15 14:21:02.788704354 -0300
> > @@ -1056,6 +1063,17 @@
> >  		internal_add_timer(base, timer);
> >  	}
> >
> > +	if (!housekeeping_cpu(base->cpu, HK_FLAG_TIMER) &&
> > +	    !(timer->flags & TIMER_DEFERRABLE)) {
> > +		call_single_data_t *c;
> > +
> > +		c = per_cpu_ptr(&raise_timer_csd, base->cpu);
> > +
> > +		/* Make sure bitmap updates are visible on remote CPUs */
> > +		smp_wmb();
> > +		smp_call_function_single_async(base->cpu, c);
> > +	}
> > +
> >  out_unlock:
> >  	raw_spin_unlock_irqrestore(&base->lock, flags);
> >
>
> Could you please explain to me why you decided to use the above
> implementation for raising the timer softirq after modifying a timer?
Because of the following race condition which is open after
"[patch 2/3] timers: do not raise softirq unconditionally (spinlockless
version)":
CPU-0                           CPU-1
                                jiffies=99
runs
add_timer_on, with
timer->expires=100
                                jiffies=100
                                run_softirq(), sees pending bitmap clear
add_timer_on
returns and
timer was not executed
P)
This race did not exist before.
So by raising a softirq on the remote CPU
at point P), it is ensured that the timer will
be executed ASAP.
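
To make the interleaving above easier to follow, here is a minimal userspace sketch of it. It is not kernel code: struct model_base and the model_*() helpers are hypothetical names invented for this illustration, the wheel is reduced to 64 buckets, and the two CPUs are replayed in a fixed order on one thread. The model also assumes the lockless bitmap check happens at tick time, before the softirq handler is allowed to run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of one CPU's timer base; not the kernel struct. */
struct model_base {
	uint64_t pending_map;       /* one bit per bucket with a queued timer */
	unsigned long expires[64];  /* expiry of the timer queued in each bucket */
	unsigned long clk;          /* last jiffy seen by the tick on this CPU */
	bool softirq_raised;        /* stands in for a raised timer softirq */
	unsigned int fired;         /* number of timers executed so far */
};

/* Tick on the remote CPU: raise the softirq only if the bitmap is non-empty. */
static void model_tick(struct model_base *b, unsigned long jiffies)
{
	b->clk = jiffies;
	if (b->pending_map != 0)
		b->softirq_raised = true;
}

/* Softirq handler: run every queued timer that has expired. */
static void model_run_softirq(struct model_base *b)
{
	if (!b->softirq_raised)
		return;
	b->softirq_raised = false;

	for (unsigned int i = 0; i < 64; i++) {
		uint64_t bit = UINT64_C(1) << i;

		if ((b->pending_map & bit) && b->expires[i] <= b->clk) {
			b->pending_map &= ~bit;
			b->fired++;
		}
	}
}

/* What add_timer_on() publishes: the timer and its bit in the bitmap. */
static void model_enqueue(struct model_base *b, unsigned long expires)
{
	unsigned int bucket = expires % 64;
	uint64_t bit = UINT64_C(1) << bucket;

	assert(!(b->pending_map & bit)); /* the model holds one timer per bucket */
	b->expires[bucket] = expires;
	b->pending_map |= bit;
}

/* Replay the interleaving; returns how many timers ran on the remote CPU. */
static unsigned int model_race(bool raise_at_p)
{
	struct model_base b = {0};

	/* CPU-0: add_timer_on() with expires=100 starts while jiffies=99. */
	unsigned long expires = 100;

	/* CPU-1: jiffies=100, the tick sees a clear bitmap, nothing is raised. */
	model_tick(&b, 100);
	model_run_softirq(&b);

	/* CPU-0: the timer becomes visible and add_timer_on() returns. */
	model_enqueue(&b, expires);

	/* Point P): optionally raise the softirq on the remote CPU. */
	if (raise_at_p)
		b.softirq_raised = true;
	model_run_softirq(&b);

	return b.fired;
}
```

In this model, model_race(false) returns 0 (the expired timer stays queued until some later tick notices it) and model_race(true) returns 1 (the raise at point P) runs it right away). The model says nothing about the cost of the extra raise, which is the overhead discussed earlier in the thread.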