[PATCH 3/4] locking: Introduce smp_cond_acquire()

From: Peter Zijlstra
Date: Thu Dec 03 2015 - 07:46:49 EST


Introduce smp_cond_acquire(), which combines a control dependency and a
read barrier to form acquire semantics.

This primitive has two benefits:
- it documents control dependencies,
- it's typically cheaper than using smp_load_acquire() in a loop.
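
For illustration, a minimal usage sketch; the ready/data pair below is
hypothetical and not part of this patch. The waiting side pairs with an
smp_store_release() on the updating side, just like the qspinlock and
scheduler users converted here:

	/* waiter: spin until ready is set, with ACQUIRE ordering */
	smp_cond_acquire(READ_ONCE(ready));
	r = data;			/* observes the store below */

	/* updater: publish data, then release ready */
	WRITE_ONCE(data, 1);
	smp_store_release(&ready, 1);	/* pairs with smp_cond_acquire() */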

Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
 include/linux/compiler.h   | 17 +++++++++++++++++
 kernel/locking/qspinlock.c |  3 +--
 kernel/sched/core.c        |  8 +-------
 kernel/sched/sched.h       |  2 +-
4 files changed, 20 insertions(+), 10 deletions(-)

--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -299,6 +299,23 @@ static __always_inline void __write_once
__u.__val; \
})

+/**
+ * smp_cond_acquire() - Spin wait for cond with ACQUIRE ordering
+ * @cond: boolean expression to wait for
+ *
+ * Equivalent to using smp_load_acquire() on the condition variable but employs
+ * the control dependency of the wait to reduce the barrier on many platforms.
+ *
+ * The control dependency provides a LOAD->STORE order, the additional RMB
+ * provides LOAD->LOAD order, together they provide LOAD->{LOAD,STORE} order,
+ * aka. ACQUIRE.
+ */
+#define smp_cond_acquire(cond)	do {		\
+	while (!(cond))				\
+		cpu_relax();			\
+	smp_rmb(); /* ctrl + rmb := acquire */	\
+} while (0)
+
#endif /* __KERNEL__ */

#endif /* __ASSEMBLY__ */
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -446,8 +446,7 @@ void queued_spin_lock_slowpath(struct qs
 	if ((val = pv_wait_head_or_lock(lock, node)))
 		goto locked;
 
-	while ((val = smp_load_acquire(&lock->val.counter)) & _Q_LOCKED_PENDING_MASK)
-		cpu_relax();
+	smp_cond_acquire(!((val = atomic_read(&lock->val)) & _Q_LOCKED_PENDING_MASK));
 
 locked:
 	/*
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1979,19 +1979,13 @@ try_to_wake_up(struct task_struct *p, un
 	/*
 	 * If the owning (remote) cpu is still in the middle of schedule() with
 	 * this task as prev, wait until its done referencing the task.
-	 */
-	while (p->on_cpu)
-		cpu_relax();
-	/*
-	 * Combined with the control dependency above, we have an effective
-	 * smp_load_acquire() without the need for full barriers.
 	 *
 	 * Pairs with the smp_store_release() in finish_lock_switch().
 	 *
 	 * This ensures that tasks getting woken will be fully ordered against
 	 * their previous state and preserve Program Order.
 	 */
-	smp_rmb();
+	smp_cond_acquire(!p->on_cpu);
 
 	p->sched_contributes_to_load = !!task_contributes_to_load(p);
 	p->state = TASK_WAKING;
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1079,7 +1079,7 @@ static inline void finish_lock_switch(st
 	 * In particular, the load of prev->state in finish_task_switch() must
 	 * happen before this.
 	 *
-	 * Pairs with the control dependency and rmb in try_to_wake_up().
+	 * Pairs with the smp_cond_acquire() in try_to_wake_up().
 	 */
 	smp_store_release(&prev->on_cpu, 0);
 #endif

