Re: [PATCH 1/4] sched/wakeup: Strengthen current_save_and_set_rtlock_wait_state()

From: Peter Zijlstra
Date: Thu Sep 09 2021 - 10:44:49 EST


On Thu, Sep 09, 2021 at 02:45:24PM +0100, Will Deacon wrote:
> On Thu, Sep 09, 2021 at 12:59:16PM +0200, Peter Zijlstra wrote:
> > While looking at current_save_and_set_rtlock_wait_state() I'm thinking
> > it really ought to use smp_store_mb(), because something like:
> >
> > current_save_and_set_rtlock_wait_state();
> > for (;;) {
> > if (try_lock())
> > break;
> >
> > raw_spin_unlock_irq(&lock->wait_lock);
> > schedule();
> > raw_spin_lock_irq(&lock->wait_lock);
> >
> > set_current_state(TASK_RTLOCK_WAIT);
> > }
> > current_restore_rtlock_saved_state();
> >
> > which is the advertised usage in the comment, is actually broken,
> > since trylock() will only need a load-acquire in general and that
> > could be re-ordered against the state store, which could lead to a
> > missed wakeup -> BAD (tm).
>
> Why doesn't the UNLOCK of pi_lock in current_save_and_set_rtlock_wait_state()
> order the state change before the successful try_lock? I'm just struggling
> to envisage how this actually goes wrong.

Moo yes, so the earlier changelog I wrote was something like:

current_save_and_set_rtlock_wait_state();
for (;;) {
if (try_lock())
break;

raw_spin_unlock_irq(&lock->wait_lock);
if (!cond)
schedule();
raw_spin_lock_irq(&lock->wait_lock);

set_current_state(TASK_RTLOCK_WAIT);
}
current_restore_rtlock_saved_state();

which is more what the code looks like before these patches, and in that
case the @cond load can be lifted before __state.

It all sorta works in the current application because most things are
serialized by ->wait_lock, but given the 'normal' wait pattern I got
highly suspicious of there not being a full barrier around.