From: Eric Dumazet <edumazet@xxxxxxxxxx>
On Mon, 2015-08-17 at 16:25 +0200, Sander Eikelenboom wrote:
> Monday, August 17, 2015, 4:21:47 PM, you wrote:
>> On Mon, 2015-08-17 at 09:02 -0500, Jon Christopherson wrote:
>>> This is very similar to the behavior I am seeing in this bug:
>>>
>>> https://bugzilla.kernel.org/show_bug.cgi?id=102911
>> OK, but have you applied the fix ?
>> http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af
>> It will be part of the next net iteration from David Miller to Linus Torvalds.
> I did have that patch in for my last report.
> But I don't think he had (looking at the second part of his oops).
Then can you try the following fix as well?
Thanks!
--
[PATCH] timer: fix a race in __mod_timer()
lock_timer_base() cannot catch the following race:

CPU1 (in __mod_timer())
timer->flags |= TIMER_MIGRATING;
spin_unlock(&base->lock);
base = new_base;
spin_lock(&base->lock);
timer->flags &= ~TIMER_BASEMASK;
                                  CPU2 (in lock_timer_base())
                                  sees timer base is cpu0 base
                                  spin_lock_irqsave(&base->lock, *flags);
                                  if (timer->flags == tf)
                                       return base; // oops, wrong base
timer->flags |= base->cpu; // too late

We must write timer->flags in one go, otherwise we can fool other CPUs.
Fixes: bc7a34b8b9eb ("timer: Reduce timer migration overhead if disabled")
Signed-off-by: Eric Dumazet <edumazet@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
---
kernel/time/timer.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 5e097fa9faf7..84190f02b521 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -807,8 +807,8 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
 			spin_unlock(&base->lock);
 			base = new_base;
 			spin_lock(&base->lock);
-			timer->flags &= ~TIMER_BASEMASK;
-			timer->flags |= base->cpu;
+			WRITE_ONCE(timer->flags,
+				   (timer->flags & ~TIMER_BASEMASK) | base->cpu);
 		}
 	}
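
For anyone who wants to see the window outside the kernel, here is a minimal
user-space sketch of the two update styles. It is only an illustration under
stated assumptions: the bit layout (CPU_MASK, FLAG_MIGRATING, BASE_MASK), the
struct fake_timer type and all function names are hypothetical, C11 atomics
stand in for WRITE_ONCE(), and reader_would_accept() is a simplified model of
the check a reader does, not the real lock_timer_base(). It assumes, as the
scenario in the changelog implies, that the base mask covers both the CPU
index and the migrating bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only (not the kernel's values). */
#define CPU_MASK        0x000FFFFFu  /* index of the owning CPU base */
#define FLAG_MIGRATING  0x00100000u  /* timer is between two bases */
#define BASE_MASK       (CPU_MASK | FLAG_MIGRATING)

struct fake_timer {
	_Atomic uint32_t flags;
};

/*
 * Simplified reader check: a snapshot is trusted for `cpu` only if the
 * timer is not migrating and the snapshot names that CPU.
 */
static bool reader_would_accept(uint32_t snapshot, uint32_t cpu)
{
	return !(snapshot & FLAG_MIGRATING) && (snapshot & CPU_MASK) == cpu;
}

/*
 * Racy variant: two stores. Returns the value that other threads can
 * observe between the first and the second store.
 */
static uint32_t set_base_two_steps(struct fake_timer *t, uint32_t cpu)
{
	uint32_t v = atomic_load_explicit(&t->flags, memory_order_relaxed);
	uint32_t between = v & ~BASE_MASK;

	atomic_store_explicit(&t->flags, between, memory_order_relaxed);
	atomic_store_explicit(&t->flags, between | cpu, memory_order_relaxed);
	return between;
}

/*
 * Fixed variant: compute the final word first, then publish it with a
 * single store, so no intermediate value is ever visible.
 */
static void set_base_one_go(struct fake_timer *t, uint32_t cpu)
{
	uint32_t v = atomic_load_explicit(&t->flags, memory_order_relaxed);

	atomic_store_explicit(&t->flags, (v & ~BASE_MASK) | cpu,
			      memory_order_relaxed);
}
```

With a timer that is marked migrating on CPU 0 and is being moved to CPU 3,
the two-step variant briefly exposes a flags word that the simplified reader
accepts for CPU 0, even though the writer already holds the new base lock.
The one-go variant moves the word straight from "migrating" to "CPU 3".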