[PATCH v5 tip/core/locking 5/7] Documentation/memory-barriers.txt: Downgrade UNLOCK+LOCK

From: Paul E. McKenney
Date: Mon Dec 09 2013 - 20:28:40 EST


From: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>

Historically, an UNLOCK+LOCK pair executed by one CPU, by one task,
or on a given lock variable has implied a full memory barrier. In a
recent LKML thread (http://www.spinics.net/lists/linux-mm/msg65653.html),
the wisdom of this historical approach was called into question, in
part due to the memory-ordering complexities of low-handoff-overhead
queued locks on x86 systems.

This patch therefore removes this guarantee from the documentation, and
further documents how to restore it via a new smp_mb__after_unlock_lock()
primitive.
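
For illustration, here is a minimal sketch (not part of the patch
itself) of how a caller might restore the historical full-barrier
semantics. The lock names, variables, and the writer() function are
hypothetical; only spin_lock(), spin_unlock(), and the
smp_mb__after_unlock_lock() primitive introduced by this series are
assumed:

	#include <linux/spinlock.h>

	static DEFINE_SPINLOCK(lock_a);	/* hypothetical locks */
	static DEFINE_SPINLOCK(lock_b);
	static int x, y;

	static void writer(void)
	{
		spin_lock(&lock_a);
		x = 1;			/* store in the first critical section */
		spin_unlock(&lock_a);	/* UNLOCK */

		spin_lock(&lock_b);	/* LOCK */
		smp_mb__after_unlock_lock();
		y = 1;			/* store in the second critical section */
		spin_unlock(&lock_b);
	}

Because the same task executes both the UNLOCK and the LOCK,
smp_mb__after_unlock_lock() upgrades the pair to a full memory
barrier, so no CPU can observe the store to y before the store to x.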

Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Will Deacon <will.deacon@xxxxxxx>
Cc: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Waiman Long <waiman.long@xxxxxx>
Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: Andi Kleen <andi@xxxxxxxxxxxxxx>
Cc: Michel Lespinasse <walken@xxxxxxxxxx>
Cc: Davidlohr Bueso <davidlohr.bueso@xxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxx>
Cc: Peter Hurley <peter@xxxxxxxxxxxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Arnd Bergmann <arnd@xxxxxxxx>
Cc: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
---
Documentation/memory-barriers.txt | 51 +++++++++++++++++++++++++++++++++------
1 file changed, 44 insertions(+), 7 deletions(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index a0763db314ff..efb791d33e5a 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1626,7 +1626,10 @@ for each construct. These operations all imply certain barriers:
operation has completed.

Memory operations issued before the LOCK may be completed after the LOCK
- operation has completed.
+ operation has completed. An smp_mb__before_spinlock(), combined
+ with a following LOCK, acts as an smp_wmb(). Note the "w":
+ this is smp_wmb(), not smp_mb(). The smp_mb__before_spinlock()
+ primitive is free on many architectures.

(2) UNLOCK operation implication:

@@ -1646,9 +1649,6 @@ for each construct. These operations all imply certain barriers:
All LOCK operations issued before an UNLOCK operation will be completed
before the UNLOCK operation.

- All UNLOCK operations issued before a LOCK operation will be completed
- before the LOCK operation.
-
(5) Failed conditional LOCK implication:

Certain variants of the LOCK operation may fail, either due to being
@@ -1656,9 +1656,6 @@ for each construct. These operations all imply certain barriers:
signal whilst asleep waiting for the lock to become available. Failed
locks do not imply any sort of barrier.

-Therefore, from (1), (2) and (4) an UNLOCK followed by an unconditional LOCK is
-equivalent to a full barrier, but a LOCK followed by an UNLOCK is not.
-
[!] Note: one of the consequences of LOCKs and UNLOCKs being only one-way
barriers is that the effects of instructions outside of a critical section
may seep into the inside of the critical section.
@@ -1677,6 +1674,40 @@ may occur as:

LOCK, STORE *B, STORE *A, UNLOCK

+An UNLOCK followed by a LOCK may -not- be assumed to be a full memory
+barrier because it is possible for a preceding UNLOCK to pass a later LOCK
+from the viewpoint of the CPU, but not from the viewpoint of the compiler.
+Note that deadlocks cannot be introduced by this interchange because if
+such a deadlock threatened, the UNLOCK would simply complete. If it is
+necessary for an UNLOCK-LOCK pair to produce a full barrier, the LOCK
+can be followed by an smp_mb__after_unlock_lock() invocation. This will
+produce a full barrier if either (a) the UNLOCK and the LOCK are executed
+by the same CPU or task, or (b) the UNLOCK and LOCK act on the same
+lock variable. The smp_mb__after_unlock_lock() primitive is free on
+many architectures. Without smp_mb__after_unlock_lock(), the UNLOCK
+and LOCK can cross:
+
+ *A = a;
+ UNLOCK
+ LOCK
+ *B = b;
+
+may occur as:
+
+ LOCK, STORE *B, STORE *A, UNLOCK
+
+With smp_mb__after_unlock_lock(), they cannot, so that:
+
+ *A = a;
+ UNLOCK
+ LOCK
+ smp_mb__after_unlock_lock();
+ *B = b;
+
+will always occur as:
+
+ STORE *A, UNLOCK, LOCK, STORE *B
+
Locks and semaphores may not provide any guarantee of ordering on UP compiled
systems, and so cannot be counted on in such a situation to actually achieve
anything at all - especially with respect to I/O accesses - unless combined
@@ -1903,6 +1934,7 @@ However, if the following occurs:
UNLOCK M [1]
ACCESS_ONCE(*D) = d; ACCESS_ONCE(*E) = e;
LOCK M [2]
+ smp_mb__after_unlock_lock();
ACCESS_ONCE(*F) = f;
ACCESS_ONCE(*G) = g;
UNLOCK M [2]
@@ -1920,6 +1952,11 @@ But assuming CPU 1 gets the lock first, CPU 3 won't see any of:
*F, *G or *H preceding LOCK M [2]
*A, *B, *C, *E, *F or *G following UNLOCK M [2]

+Note that the smp_mb__after_unlock_lock() is critically important
+here: without it, CPU 3 might see some of the above orderings,
+because without this primitive the accesses are not guaranteed to
+be seen in order unless CPU 3 itself holds lock M.
+

LOCKS VS I/O ACCESSES
---------------------
--
1.8.1.5
