[PATCH -v2 4/4] locking: Remove smp_mb__before_spinlock()

From: Peter Zijlstra
Date: Wed Aug 02 2017 - 07:44:16 EST


Now that there are no users of smp_mb__before_spinlock() left, remove
it entirely.

Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
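
For reference, a minimal sketch of the ordering idiom that replaces the
removed primitive, assuming a caller that needs a store issued before the
lock to be ordered against loads and stores inside the critical section
(x, y, r and lock are placeholder names, not taken from any real caller):

    /* previously */
    WRITE_ONCE(x, 1);
    smp_mb__before_spinlock();  /* intended: order the store above vs. the critical section */
    spin_lock(&lock);
    r = READ_ONCE(y);
    spin_unlock(&lock);

    /* now */
    WRITE_ONCE(x, 1);
    spin_lock(&lock);
    smp_mb__after_spinlock();   /* full smp_mb(); see include/linux/spinlock.h */
    r = READ_ONCE(y);
    spin_unlock(&lock);

The old generic fallback was only smp_wmb(), so it ordered the prior store
against later stores but never against loads inside the critical section;
smp_mb__after_spinlock() makes the required full ordering explicit.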
Documentation/memory-barriers.txt | 5 ---
Documentation/translations/ko_KR/memory-barriers.txt | 5 ---
arch/arm64/include/asm/spinlock.h | 9 ------
arch/powerpc/include/asm/barrier.h | 7 -----
fs/userfaultfd.c | 25 ++++++++-----------
include/linux/spinlock.h | 13 ---------
6 files changed, 13 insertions(+), 51 deletions(-)

--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1981,10 +1981,7 @@ In all cases there are variants on "ACQU
ACQUIRE operation has completed.

Memory operations issued before the ACQUIRE may be completed after
- the ACQUIRE operation has completed. An smp_mb__before_spinlock(),
- combined with a following ACQUIRE, orders prior stores against
- subsequent loads and stores. Note that this is weaker than smp_mb()!
- The smp_mb__before_spinlock() primitive is free on many architectures.
+ the ACQUIRE operation has completed.

(2) RELEASE operation implication:

--- a/Documentation/translations/ko_KR/memory-barriers.txt
+++ b/Documentation/translations/ko_KR/memory-barriers.txt
@@ -1956,10 +1956,7 @@ MMIO write barrier
the ACQUIRE operation has completed.

Memory operations issued before the ACQUIRE may be completed after the
- ACQUIRE operation has completed. An smp_mb__before_spinlock() followed by
- an ACQUIRE orders stores before the block against loads and stores after
- the block. Note that this is weaker than smp_mb()! On many architectures
- smp_mb__before_spinlock() does nothing at all.
+ ACQUIRE operation has completed.

(2) RELEASE operation implication:

--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -358,15 +358,6 @@ static inline int arch_read_trylock(arch
#define arch_read_relax(lock) cpu_relax()
#define arch_write_relax(lock) cpu_relax()

-/*
- * Accesses appearing in program order before a spin_lock() operation
- * can be reordered with accesses inside the critical section, by virtue
- * of arch_spin_lock being constructed using acquire semantics.
- *
- * In cases where this is problematic (e.g. try_to_wake_up), an
- * smp_mb__before_spinlock() can restore the required ordering.
- */
-#define smp_mb__before_spinlock() smp_mb()
/* See include/linux/spinlock.h */
#define smp_mb__after_spinlock() smp_mb()

--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -74,13 +74,6 @@ do { \
___p1; \
})

-/*
- * This must resolve to hwsync on SMP for the context switch path.
- * See _switch, and core scheduler context switch memory ordering
- * comments.
- */
-#define smp_mb__before_spinlock() smp_mb()
-
#include <asm-generic/barrier.h>

#endif /* _ASM_POWERPC_BARRIER_H */
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -109,27 +109,24 @@ static int userfaultfd_wake_function(wai
goto out;
WRITE_ONCE(uwq->waken, true);
/*
- * The implicit smp_mb__before_spinlock in try_to_wake_up()
- * renders uwq->waken visible to other CPUs before the task is
- * waken.
+ * The Program-Order guarantees provided by the scheduler
+ * ensure uwq->waken is visible before the task is woken.
*/
ret = wake_up_state(wq->private, mode);
- if (ret)
+ if (ret) {
/*
* Wake only once, autoremove behavior.
*
- * After the effect of list_del_init is visible to the
- * other CPUs, the waitqueue may disappear from under
- * us, see the !list_empty_careful() in
- * handle_userfault(). try_to_wake_up() has an
- * implicit smp_mb__before_spinlock, and the
- * wq->private is read before calling the extern
- * function "wake_up_state" (which in turns calls
- * try_to_wake_up). While the spin_lock;spin_unlock;
- * wouldn't be enough, the smp_mb__before_spinlock is
- * enough to avoid an explicit smp_mb() here.
+ * After the effect of list_del_init is visible to the other
+ * CPUs, the waitqueue may disappear from under us, see the
+ * !list_empty_careful() in handle_userfault().
+ *
+ * try_to_wake_up() has an implicit smp_mb(), and the
+ * wq->private is read before calling the extern function
+ * "wake_up_state" (which in turns calls try_to_wake_up).
*/
list_del_init(&wq->entry);
+ }
out:
return ret;
}
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -118,19 +118,6 @@ do { \
#endif

/*
- * Despite its name it doesn't necessarily has to be a full barrier.
- * It should only guarantee that a STORE before the critical section
- * can not be reordered with LOADs and STOREs inside this section.
- * spin_lock() is the one-way barrier, this LOAD can not escape out
- * of the region. So the default implementation simply ensures that
- * a STORE can not move into the critical section, smp_wmb() should
- * serialize it with another STORE done by spin_lock().
- */
-#ifndef smp_mb__before_spinlock
-#define smp_mb__before_spinlock() smp_wmb()
-#endif
-
-/*
* This barrier must provide two things:
*
* - it must guarantee a STORE before the spin_lock() is ordered against a