Re: [PATCH v6 5/7] locking: Add contended_release tracepoint to qspinlock
From: Dmitry Ilvokhin
Date: Wed May 27 2026 - 09:45:08 EST
Hi Peter,
Gentle ping on this. I wanted to check if the assembly analysis in my
previous reply changed the picture at all.
You were right that the commit message was misleading about the total
size increase: it is 9 bytes per call site, not just the 2-byte NOP.
That said, when I looked at the executed path with the tracepoint
disabled, the only addition is the 2-byte NOP (xchg %ax,%ax).
Both the baseline and instrumented _raw_spin_unlock() fit within a
single 64-byte cache line, and I wasn't able to measure any difference
with locktorture: lock() cost completely dominates, unlock() accounts
for less than 1% of the total, so any overhead is indistinguishable from
noise.
If the cost is still a concern, I see two possible paths forward:
1. Guard the spinlock/qrwlock instrumentation behind a Kconfig option
   (disabled by default), so only kernels that explicitly opt in pay
   the cost.
2. Drop the spinlock/qrwlock instrumentation entirely and keep
   contended_release for sleepable locks only.
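
To make option 1 concrete, here is a minimal userspace sketch of a
compile-time guard. It is not the patch itself: the option name
CONFIG_LOCK_RELEASE_TRACE_SPIN, struct demo_lock, its has_waiters field and
all demo_* helpers are hypothetical stand-ins, and the counter merely stands
in for a real tracepoint.

```c
/* Userspace illustration only; none of these names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock: has_waiters stands in for "release was contended". */
struct demo_lock {
	bool locked;
	bool has_waiters;
};

/*
 * CONFIG_LOCK_RELEASE_TRACE_SPIN is an assumed option name. Build with
 * -DCONFIG_LOCK_RELEASE_TRACE_SPIN to opt in; by default the hook is an
 * empty inline function that the compiler can drop entirely.
 */
#ifdef CONFIG_LOCK_RELEASE_TRACE_SPIN
static unsigned long demo_trace_hits;	/* stands in for the tracepoint */

static inline void demo_trace_contended_release(struct demo_lock *lock)
{
	(void)lock;
	demo_trace_hits++;
}

static inline unsigned long demo_hits(void)
{
	return demo_trace_hits;
}
#else
static inline void demo_trace_contended_release(struct demo_lock *lock)
{
	(void)lock;
}

static inline unsigned long demo_hits(void)
{
	return 0;
}
#endif

/* Unlock path: the hook is reached only when the lock had waiters. */
static inline void demo_unlock(struct demo_lock *lock)
{
	assert(lock->locked);
	if (lock->has_waiters)
		demo_trace_contended_release(lock);
	lock->locked = false;
}
```

With the option left undefined, the hook body is empty, so an optimizing
compiler would be expected to emit nothing for it at the unlock call site.
Defining the option restores the instrumentation for kernels that opt in.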
Happy to go in whichever direction you prefer.