wake_q memory ordering

From: Manfred Spraul
Date: Thu Oct 10 2019 - 06:41:19 EST


Hi,

Waiman Long noticed that the memory barriers in sem_lock() are not really documented, and while adding documentation, I ended up with one case where I'm not certain about the wake_q code:

Questions:
- Does smp_mb__before_atomic() + a (failed) cmpxchg_relaxed provide an
 ordering guarantee?
- Is it ok that wake_up_q just writes wake_q->next, shouldn't
 smp_store_acquire() be used? I.e.: guarantee that wake_up_process()
 happens after cmpxchg_relaxed(), assuming that a failed cmpxchg_relaxed
 provides any ordering.

Example:
- CPU2 never touches lock a. It is just an unrelated wake_q user that also
 wants to wake up task 1234.
- I've noticed already that smp_store_acquire() doesn't exist.
 So smp_store_mb() is required. But from semantical point of view, we would
 need an ACQUIRE: the wake_up_process() must happen after cmpxchg().
- May wake_up_q() rely on the spinlocks/memory barriers in try_to_wake_up,
 or should the function be safe by itself?

CPU1: /current=1234, inside do_semtimedop()/
ÂÂÂÂÂÂÂ g_wakee = current;
ÂÂÂÂÂÂÂ current->state = TASK_INTERRUPTIBLE;
ÂÂÂÂÂÂÂ spin_unlock(a);

CPU2: / arbitrary kernel thread that uses wake_q /
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ wake_q_add(&unrelated_q, 1234);
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ wake_up_q(&unrelated_q);
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ <...ongoing>

CPU3: / do_semtimedop() + wake_up_sem_queue_prepare() /
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ spin_lock(a);
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ wake_q_add(,g_wakee);
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ < within wake_q_add() >:
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ smp_mb__before_atomic();
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ if (unlikely(cmpxchg_relaxed(&node->next, NULL, WAKE_Q_TAIL)))
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ return false; /* -> this happens */

CPU2:
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ <within wake_up_q>
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ 1234->wake_q.next = NULL; <<<<<<<<< Ok? Is store_acquire() missing? >>>>>>>>>>>>
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ wake_up_process(1234);
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ < within wake_up_process/try_to_wake_up():
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ raw_spin_lock_irqsave()
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ smp_mb__after_spinlock()
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ if(1234->state = TASK_RUNNING) return;
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ >


rewritten:

start condition: A = 1; B = 0;

CPU1:
ÂÂÂ B = 1;
ÂÂÂ RELEASE, unlock LockX;

CPU2:
ÂÂÂ lock LockX, ACQUIRE
ÂÂÂ if (LOAD A == 1) return; /* using cmp_xchg_relaxed */

CPU2:
ÂÂÂ A = 0;
ÂÂÂ ACQUIRE, lock LockY
ÂÂÂ smp_mb__after_spinlock();
ÂÂÂ READ B

Question: is A = 1, B = 0 possible?

--

ÂÂÂ Manfred