Re: timers: Move clearing of base::timer_running under base::lock

From: Thomas Gleixner
Date: Mon Dec 07 2020 - 09:30:34 EST


On Mon, Dec 07 2020 at 14:07, Sebastian Andrzej Siewior wrote:
> On 2020-12-06 22:40:07 [+0100], Thomas Gleixner wrote:
>> syzbot reported KCSAN data races vs. timer_base::timer_running being set to
>> NULL without holding base::lock in expire_timers().
>>
>> This looks innocent and most reads are clearly not problematic but for a
>> non-RT kernel it's completely irrelevant whether the store happens before
>> or after taking the lock. For an RT kernel moving the store under the lock
>> requires an extra unlock/lock pair in the case that there is a waiter for
>> the timer. But that's not the end of the world and definitely not worth the
>> trouble of adding boatloads of comments and annotations to the code. Famous
>> last words...
>>
>> Reported-by: syzbot+aa7c2385d46c5eba0b89@xxxxxxxxxxxxxxxxxxxxxxxxx
>> Reported-by: syzbot+abea4558531bae1ba9fe@xxxxxxxxxxxxxxxxxxxxxxxxx
>> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>
> One thing I noticed while testing it is that the "corner" case in
> timer_sync_wait_running() is quite reliably hit by rcu_preempt
> rcu_gp_fqs_loop() -> swait_event_idle_timeout_exclusive() invocation.

I assume it's something like this:

    timeout -> wakeup

    ->preemption
                        del_timer_sync()
                        .....
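
For reference, the ordering the patch is after can be modelled in
userspace. This is only a sketch under assumptions: a pthread mutex stands
in for base::lock, and struct base_model, struct timer_model, expire_one(),
callback_in_flight() and run_model() are made-up names for illustration,
not the kernel code. The RT waiter handling is left out entirely.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real ones. */
struct timer_model {
	void (*fn)(struct timer_model *t);
};

struct base_model {
	pthread_mutex_t lock;			/* plays the role of base::lock */
	struct timer_model *timer_running;	/* callback currently in flight */
};

static struct base_model model_base = { PTHREAD_MUTEX_INITIALIZER, NULL };
static int seen_in_callback;

/* What a del_timer_sync()-like caller would check, under the lock. */
static int callback_in_flight(struct base_model *base, struct timer_model *timer)
{
	int busy;

	pthread_mutex_lock(&base->lock);
	busy = (base->timer_running == timer);
	pthread_mutex_unlock(&base->lock);
	return busy;
}

/* Expire one timer. Called with base->lock held, returns with it held. */
static void expire_one(struct base_model *base, struct timer_model *timer)
{
	base->timer_running = timer;

	pthread_mutex_unlock(&base->lock);
	timer->fn(timer);			/* callback runs unlocked */
	pthread_mutex_lock(&base->lock);

	/* Cleared only after the lock is re-taken: every store is locked. */
	base->timer_running = NULL;
}

/* Sample callback: records whether it is seen as running from inside. */
static void sample_fn(struct timer_model *t)
{
	seen_in_callback = callback_in_flight(&model_base, t);
}

/* Drive one expiry; returns 1 if the model behaved as described. */
static int run_model(void)
{
	struct timer_model t = { sample_fn };

	pthread_mutex_lock(&model_base.lock);
	expire_one(&model_base, &t);
	pthread_mutex_unlock(&model_base.lock);

	assert(seen_in_callback);
	return !callback_in_flight(&model_base, &t);
}
```

In this model a reader that takes the lock sees either the timer that is
still running or NULL, and never races with the store that clears it.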

Thanks,

tglx