Re: [RT PATCH 1/3] hrtimer: Use READ_ONCE to access timer->base in hrimer_grab_expiry_lock()

From: Eric Dumazet
Date: Thu Sep 26 2019 - 17:47:58 EST




On 8/21/19 6:50 AM, Thomas Gleixner wrote:
> On Wed, 21 Aug 2019, Sebastian Andrzej Siewior wrote:
>
>> On 2019-08-21 10:24:07 [+0100], Julien Grall wrote:
>>> The update to timer->base is protected by the base->cpu_base->lock().
>>> However, hrtimer_grab_expirty_lock() does not access it with the lock.
>>>
>>> So it would theorically be possible to have timer->base changed under
>>> our feet. We need to prevent the compiler to refetch timer->base so the
>>> check and the access is performed on the same base.
>>
>> It is not a problem if the timer's bases changes. We get here because we
>> want to help the timer to complete its callback.
>> The base can only change if the timer gets re-armed on another CPU which
>> means is completed callback. In every case we can cancel the timer on
>> the next iteration.
>
> It _IS_ a problem when the base changes and the compiler reloads
>
> CPU0 CPU1
> base = timer->base;
>
> lock(base->....);
> switch base
>
> reload
> base = timer->base;
>
> unlock(base->....);
>

It seems we could hit a similar problem in lock_hrtimer_base()

base = timer->base;

if (likely(base != &migration_base)) {

<reload base : could point to &migration_base>

raw_spin_lock_irqsave(&base->cpu_base->lock, *flags);

Probably not a big deal, since migration_base-cpu_base->lock can be locked just fine,
(without lockdep complaining that the lock has not been initialized since we use raw_ variant),
but this could cause unnecessary false sharing.


diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 0d4dc241c0fb498036c91a571e65cb00f5d19ba6..fa881c03e0a1a351186a8d8f798dd7471067a951 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -164,7 +164,7 @@ struct hrtimer_clock_base *lock_hrtimer_base(const struct hrtimer *timer,
struct hrtimer_clock_base *base;

for (;;) {
- base = timer->base;
+ base = READ_ONCE(timer->base);
if (likely(base != &migration_base)) {
raw_spin_lock_irqsave(&base->cpu_base->lock, *flags);
if (likely(base == timer->base))