Re: [Question]: try to fix contention between expire_timers and try_to_del_timer_sync

From: Thomas Gleixner
Date: Thu Jul 27 2017 - 11:19:09 EST


On Thu, 27 Jul 2017, Will Deacon wrote:
> On Thu, Jul 27, 2017 at 09:29:20AM +0800, qiaozhou wrote:
> > On 2017年07月26日 22:16, Thomas Gleixner wrote:
> > >--- a/kernel/time/timer.c
> > >+++ b/kernel/time/timer.c
> > >@@ -1301,10 +1301,12 @@ static void expire_timers(struct timer_b
> > > 		if (timer->flags & TIMER_IRQSAFE) {
> > > 			raw_spin_unlock(&base->lock);
> > > 			call_timer_fn(timer, fn, data);
> > >+			base->running_timer = NULL;
> > > 			raw_spin_lock(&base->lock);
> > > 		} else {
> > > 			raw_spin_unlock_irq(&base->lock);
> > > 			call_timer_fn(timer, fn, data);
> > >+			base->running_timer = NULL;
> > > 			raw_spin_lock_irq(&base->lock);
> > > 		}
> > > 	}
> > It should work for this particular issue and I'll test it. Previously I
> > thought it was unsafe to touch base->running_timer without holding lock.
>
> I think it works out in practice because base->lock and base->running_timer
> share a cacheline, so end up being ordered correctly. We should probably be
> using READ_ONCE/WRITE_ONCE for accessing the running_timer field though.
>
> One thing I don't get though, is why try_to_del_timer_sync needs to check
> base->running_timer at all. Given that it holds the base->lock, can't it
> be the person that sets it to NULL?

No. The timer callback code does:

	base->running_timer = timer;
	spin_unlock(base->lock);
	fn(timer);
	spin_lock(base->lock);
	base->running_timer = NULL;

So for del_timer_sync() the only way to figure out whether the timer
callback is running is to check base->running_timer. We cannot store state
in the timer itself because we cannot clear that state when the callback
returns, as the timer might have been freed in the callback. Yes, that's
nasty, but reality.
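
To make that concrete, here is a rough userspace model of the sequence
above. It is a sketch only, not kernel code: every model_* name is made
up for illustration, the "lock" is a plain flag, and the second CPU is
simulated by calling the delete path from inside the callback. It shows
that the deleting side can detect a running callback purely from
base->running_timer, even though the callback frees its own timer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct model_timer;

/* Hypothetical stand-in for struct timer_base. */
struct model_base {
	int locked;				/* models base->lock */
	struct model_timer *running_timer;	/* callback in progress, or NULL */
};

/* Hypothetical stand-in for struct timer_list. */
struct model_timer {
	void (*fn)(struct model_timer *);
	int pending;
};

static struct model_base base;
static int seen_during_callback;	/* what the "other CPU" observed */

static void model_lock(struct model_base *b)   { assert(!b->locked); b->locked = 1; }
static void model_unlock(struct model_base *b) { assert(b->locked);  b->locked = 0; }

/*
 * Returns -1 if the callback is currently running, 1 if a pending timer
 * was removed, 0 if the timer was not pending.
 */
static int model_try_to_del_timer_sync(struct model_base *b,
					struct model_timer *t)
{
	int ret = -1;

	model_lock(b);
	if (b->running_timer != t) {
		ret = t->pending;
		t->pending = 0;
	}
	model_unlock(b);
	return ret;
}

/* Callback that frees its own timer, as real callbacks are allowed to. */
static void self_freeing_callback(struct model_timer *t)
{
	/* Simulate another CPU trying to delete the timer right now. */
	seen_during_callback = model_try_to_del_timer_sync(&base, t);
	free(t);
}

static void model_expire_timer(struct model_base *b, struct model_timer *t)
{
	void (*fn)(struct model_timer *) = t->fn;

	model_lock(b);
	t->pending = 0;
	b->running_timer = t;
	model_unlock(b);
	fn(t);			/* t may be freed here, never touch it again */
	model_lock(b);
	b->running_timer = NULL;	/* state lives in the base, not the timer */
	model_unlock(b);
}

/* Runs the scenario and returns what the simulated other CPU saw. */
static int model_demo(void)
{
	struct model_timer *t = malloc(sizeof(*t));

	if (!t)
		return -2;
	t->fn = self_freeing_callback;
	t->pending = 1;
	model_expire_timer(&base, t);
	return seen_during_callback;
}
```

Calling model_demo() from a main() returns -1 in this model: the delete
attempt made while the callback runs is refused, and nothing reads the
timer after the callback has freed it. Had the "running" state been kept
in the timer, clearing it after fn() would be a use after free.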

Thanks,

tglx