Re: [PATCH v6 5/7] locking: Add contended_release tracepoint to qspinlock

From: Dmitry Ilvokhin

Date: Wed May 27 2026 - 09:45:08 EST


Hi Peter,

Gentle ping on this. I wanted to check if the assembly analysis in my
previous reply changed the picture at all.

You were right that the commit message was misleading about the total
size increase: it's 9 bytes per call site, not just the 2-byte NOP.

That said, on the executed path with the tracepoint disabled, the only
addition is the 2-byte NOP (xchg %ax,%ax).

Both the baseline and the instrumented _raw_spin_unlock() fit within a
single 64-byte cache line, and I wasn't able to measure any difference
with locktorture. The lock() cost completely dominates and unlock()
accounts for less than 1% of the total, so any overhead is
indistinguishable from noise.

If the cost is still a concern, I see two possible paths forward:

1. Guard the spinlock/qrwlock instrumentation behind a Kconfig option
   (disabled by default), so only kernels that explicitly opt in pay
   the cost.

2. Drop the spinlock/qrwlock instrumentation entirely and keep
   contended_release for sleepable locks only.
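
To make option 1 concrete, here is a rough userspace sketch of the shape
it could take. It is not code from the series: the config symbol, the
lock struct and every sketch_* helper are hypothetical stand-ins, and the
real hook would be a tracepoint rather than a counter. The point is only
that with the option off the hook compiles to nothing, so the unlock path
is the same as the baseline.

```c
/* Userspace sketch only: every name here is hypothetical, not from the patch. */
#include <stdbool.h>

/*
 * Stand-in for a Kconfig symbol that defaults to off. Building with
 * -DCONFIG_SPINLOCK_CONTENDED_RELEASE_SKETCH models a kernel that opts in.
 */
#ifdef CONFIG_SPINLOCK_CONTENDED_RELEASE_SKETCH
static int sketch_trace_hits;

/* Opted in: record releases that happen while someone is waiting. */
static inline void sketch_trace_contended_release(bool contended)
{
	if (contended)
		sketch_trace_hits++;
}
#else
/* Default: empty stub, so the call below generates no code at all. */
static inline void sketch_trace_contended_release(bool contended)
{
	(void)contended;
}
#endif

/* Toy lock, only here to give the hook a call site. */
struct sketch_lock {
	int locked;
	int waiters;
};

static inline void sketch_unlock(struct sketch_lock *l)
{
	sketch_trace_contended_release(l->waiters > 0);
	l->locked = 0;
}
```

With the symbol undefined, sketch_unlock() reduces to the plain store, so
call sites would not grow, at the price of the tracepoint being
unavailable on default builds.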

Happy to go whichever direction you prefer.