On Thu, Jun 16, 2016 at 05:35:54PM -0400, Waiman Long wrote:
> On 06/15/2016 10:19 PM, Boqun Feng wrote:
> > On Wed, Jun 15, 2016 at 03:01:19PM -0400, Waiman Long wrote:
> > > On 06/15/2016 04:04 AM, Boqun Feng wrote:
> > > > Hi Waiman,
> > > >
> > > > On Tue, Jun 14, 2016 at 06:48:04PM -0400, Waiman Long wrote:
> > > > > The osq_lock() and osq_unlock() functions may not provide the necessary
> > > > > acquire and release barriers in some cases. This patch makes sure
> > > > > that the proper barriers are provided when osq_lock() is successful
> > > > > or when osq_unlock() is called.
> > > > >
> > > > > Signed-off-by: Waiman Long <Waiman.Long@xxxxxxx>
> > > > > ---
> > > > >  kernel/locking/osq_lock.c | 4 ++--
> > > > >  1 files changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > > > > index 05a3785..7dd4ee5 100644
> > > > > --- a/kernel/locking/osq_lock.c
> > > > > +++ b/kernel/locking/osq_lock.c
> > > > > @@ -115,7 +115,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> > > > >  	 * cmpxchg in an attempt to undo our queueing.
> > > > >  	 */
> > > > > -	while (!READ_ONCE(node->locked)) {
> > > > > +	while (!smp_load_acquire(&node->locked)) {
> > > > >  		/*
> > > > >  		 * If we need to reschedule bail... so we can block.
> > > > >  		 */
> > > > > @@ -198,7 +198,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
> > > > >  	 * Second most likely case.
> > > > >  	 */
> > > > >  	node = this_cpu_ptr(&osq_node);
> > > > > -	next = xchg(&node->next, NULL);
> > > > > +	next = xchg_release(&node->next, NULL);
> > > > >  	if (next) {
> > > > >  		WRITE_ONCE(next->locked, 1);
> > > >
> > > > So we still use WRITE_ONCE() rather than smp_store_release() here?
> > > >
> > > > Though, IIUC, this is fine for all the archs but ARM64, because there
> > > > will always be a xchg_release()/xchg() before the WRITE_ONCE(), which
> > > > carries a necessary barrier to upgrade WRITE_ONCE() to a RELEASE.
> > > >
> > > > Not sure whether it's a problem on ARM64, but I think we certainly need
> > > > to add some comments here, if we count on this trick.
> > > >
> > > > Am I missing something or misunderstanding you here?
> > > >
> > > > Regards,
> > > > Boqun
> > >
> > > The change on the unlock side is more for documentation purpose than is
> > > actually needed. As you had said, the xchg() call has provided the necessary
> > > memory barrier. Using the _release variant, however, may have some
> > > [...]
> >
> > But I'm afraid the barrier doesn't remain if we replace xchg() with
> > xchg_release() on ARM64v8. IIUC, xchg_release() is just a ldxr+stlxr
> > loop with no barrier on ARM64v8. This means the following code:
> >
> >     CPU 0                               CPU 1 (next)
> >     ========================            ==================
> >     WRITE_ONCE(x, 1);                   r1 = smp_load_acquire(&next->locked);
> >     xchg_release(&node->next, NULL);    r2 = READ_ONCE(x);
> >     WRITE_ONCE(next->locked, 1);
> >
> > could result in (r1 == 1 && r2 == 0) on ARM64v8, IIUC.
>
> If you look into the actual code:
>
> 	next = xchg_release(&node->next, NULL);
> 	if (next) {
> 		WRITE_ONCE(next->locked, 1);
> 		return;
> 	}
>
> There is a control dependency that WRITE_ONCE() won't happen until
> [...]

But a control dependency only orders LOAD->STORE pairs, right? And here
the control dependency orders the LOAD part of xchg_release() and the
WRITE_ONCE().

Combine that with the fact that RELEASE only orders the STORE part of the
xchg against the memory operations preceding that STORE part, and for the
following code:
	WRITE_ONCE(x, 1);
	next = xchg_release(&node->next, NULL);
	if (next)
		WRITE_ONCE(next->locked, 1);
such a reordering is allowed to happen on ARM64v8:

	next = ldxr [&node->next]	// LOAD part of xchg_release()
	if (next)
		WRITE_ONCE(next->locked, 1);
	WRITE_ONCE(x, 1);
	stlxr NULL [&node->next]	// STORE part of xchg_release()
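
To make the handoff concrete, here is a rough user-space sketch of the
same pattern written with C11 atomics and pthreads. It is not kernel code,
and the C11 memory model is only an approximation of the kernel's: struct
demo_node, waiter_fn(), unlock_fn() and run_handoff() are made-up names
for illustration. The sketch assumes the handoff store is made a RELEASE
(the analogue of smp_store_release()), which is what pairs with the
acquire load on the waiter side and guarantees the waiter sees x == 1.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an OSQ node; not the kernel's structure. */
struct demo_node {
	_Atomic(struct demo_node *) next;
	atomic_int locked;
};

static int x;				/* plain data published by the unlocker */
static struct demo_node owner;		/* plays "node" (CPU 0) */
static struct demo_node waiter;		/* plays "next" (CPU 1) */

/* CPU 1: spin until handed the lock, then read the published data. */
static void *waiter_fn(void *arg)
{
	int *seen = arg;

	/* Analogue of smp_load_acquire(&node->locked). */
	while (!atomic_load_explicit(&waiter.locked, memory_order_acquire))
		;
	*seen = x;
	return NULL;
}

/* CPU 0: publish x, detach the successor, hand over the lock. */
static void unlock_fn(void)
{
	struct demo_node *next;

	x = 1;
	next = atomic_exchange_explicit(&owner.next, NULL,
					memory_order_release);
	if (next) {
		/* Analogue of smp_store_release(&next->locked, 1). */
		atomic_store_explicit(&next->locked, 1, memory_order_release);
	}
}

/* Run one handoff and return the value of x the waiter observed. */
int run_handoff(void)
{
	pthread_t t;
	int seen = -1;
	int ret;

	x = 0;
	atomic_store(&waiter.locked, 0);
	atomic_store(&owner.next, &waiter);

	ret = pthread_create(&t, NULL, waiter_fn, &seen);
	assert(ret == 0);
	unlock_fn();
	pthread_join(t, NULL);
	return seen;
}
```

Build with -pthread. With the release store in place, run_handoff() has to
return 1. If that store were relaxed instead (the closest C11 analogue of
WRITE_ONCE()), the C11 model would no longer order x = 1 before the
handoff, and the read of x would formally become a data race.
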
Am I missing your point here?
Regards,
Boqun