[PATCH v5 5/7] locking: Add contended_release tracepoint to qspinlock

From: Dmitry Ilvokhin

Date: Thu Apr 16 2026 - 11:07:43 EST


Use the arch-overridable queued_spin_release(), introduced in the
previous commit, to ensure the tracepoint works correctly across all
architectures, including those with custom unlock implementations (e.g.
x86 paravirt).

When the tracepoint is disabled, the only addition to the hot path is a
single NOP instruction (the static branch). When enabled, the contention
check, trace call, and unlock are combined in an out-of-line function, so
the compiler does not need to preserve the lock pointer in a callee-saved
register across the trace call, keeping hot path impact minimal.

Binary size impact (x86_64, defconfig):
uninlined unlock (common case): +680 bytes (+0.00%)
inlined unlock (worst case): +83659 bytes (+0.21%)

The inlined unlock case cannot be reached through Kconfig options on
x86_64, as PREEMPT_BUILD unconditionally selects UNINLINE_SPIN_UNLOCK
there. The UNINLINE_SPIN_UNLOCK guards were manually inverted to
force-inline the unlock path and estimate the worst-case binary size
increase.

In practice, configurations with UNINLINE_SPIN_UNLOCK=n have already
opted against binary size optimization, so the inlined worst case is
unlikely to be a concern.

Architectures with fully custom qspinlock implementations (e.g.
PowerPC) are not covered by this change.

Signed-off-by: Dmitry Ilvokhin <d@xxxxxxxxxxxx>
---
include/asm-generic/qspinlock.h | 18 ++++++++++++++++++
kernel/locking/qspinlock.c | 8 ++++++++
2 files changed, 26 insertions(+)

diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
index df76f34645a0..915a4c2777f6 100644
--- a/include/asm-generic/qspinlock.h
+++ b/include/asm-generic/qspinlock.h
@@ -41,6 +41,7 @@

#include <asm-generic/qspinlock_types.h>
#include <linux/atomic.h>
+#include <linux/tracepoint-defs.h>

#ifndef queued_spin_is_locked
/**
@@ -129,12 +130,29 @@ static __always_inline void queued_spin_release(struct qspinlock *lock)
}
#endif

+DECLARE_TRACEPOINT(contended_release);
+
+extern void queued_spin_release_traced(struct qspinlock *lock);
+
/**
* queued_spin_unlock - unlock a queued spinlock
* @lock : Pointer to queued spinlock structure
+ *
+ * Generic tracing wrapper around the arch-overridable
+ * queued_spin_release().
*/
static __always_inline void queued_spin_unlock(struct qspinlock *lock)
{
+ /*
+ * Trace and release are combined in queued_spin_release_traced() so
+ * the compiler does not need to preserve the lock pointer across the
+ * function call, avoiding callee-saved register save/restore on the
+ * hot path.
+ */
+ if (tracepoint_enabled(contended_release)) {
+ queued_spin_release_traced(lock);
+ return;
+ }
queued_spin_release(lock);
}

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index af8d122bb649..c72610980ec7 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -104,6 +104,14 @@ static __always_inline u32 __pv_wait_head_or_lock(struct qspinlock *lock,
#define queued_spin_lock_slowpath native_queued_spin_lock_slowpath
#endif

+void __lockfunc queued_spin_release_traced(struct qspinlock *lock)
+{
+ if (queued_spin_is_contended(lock))
+ trace_contended_release(lock);
+ queued_spin_release(lock);
+}
+EXPORT_SYMBOL(queued_spin_release_traced);
+
#endif /* _GEN_PV_LOCK_SLOWPATH */

/**
--
2.52.0