Re: [PATCH 3/4] locking/osq_lock: From x_osq_lock/unlock to numa-aware lock/unlock.

From: Uros Bizjak
Date: Sun Sep 15 2024 - 14:44:46 EST




On 14. 09. 24 10:53, yongli-oc wrote:
According to the contention level, switches from x_osq_lock to
numa-aware osq_lock.
The numa-aware lock is a two level osq_lock.
The Makefile for dynamic numa-aware osq lock.

Signed-off-by: yongli-oc <yongli-oc@xxxxxxxxxxx>
---
kernel/locking/Makefile | 1 +
kernel/locking/numa.h | 98 ++++++++
kernel/locking/numa_osq.h | 32 +++
kernel/locking/x_osq_lock.c | 332 +++++++++++++++++++++++++++
kernel/locking/zx_numa_osq.c | 433 +++++++++++++++++++++++++++++++++++
5 files changed, 896 insertions(+)
create mode 100644 kernel/locking/numa.h
create mode 100644 kernel/locking/numa_osq.h
create mode 100644 kernel/locking/x_osq_lock.c
create mode 100644 kernel/locking/zx_numa_osq.c

...

+ if (lock->numa_enable == OSQLOCKSTOPPING && old == OSQ_UNLOCKED_VAL)
+ old = OSQ_LOCKED_VAL;
+
+ for (;;) {
+ if (READ_ONCE(lock->tail16) == curr &&
+ cmpxchg(&lock->tail16, curr, old) == curr) {

I would like to ask if there is any benefit to read the location two times? cmpxchg() reads the location and skips the update when curr is different from the value at the location by itself. Using try_cmpxchg() can produce even more optimized code, since on x86 arch CMPXCHG also sets ZF flag when operand 2 is equal to the value at location (and update happens), and this flag can be used in a conditional jump.

The above condition could read:

if (try_cmpxchg(&lock->tail16, &curr, old)) {

with an optional temporary variable instead of curr, because try_cmpxchg() updates its second argument:

+ for (;;) {
+ u16 tmp = curr;
+ if (try_cmpxchg(&lock->tail16, &tmp, old)) {

Please note that the above form could produce slightly more optimized code also on non-x86 targets.

Uros.