Re: [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock()

From: Peter Zijlstra
Date: Fri Oct 25 2024 - 04:02:18 EST


On Tue, Oct 01, 2024 at 01:45:57PM +0200, Uros Bizjak wrote:
> Replace this pattern in osq_unlock():
>
> atomic_cmpxchg(*ptr, old, new) == old
>
> ... with the simpler and faster:
>
> atomic_try_cmpxchg(*ptr, &old, new)
>
> The x86 CMPXCHG instruction returns success in the ZF flag,
> so this change saves a compare after the CMPXCHG. The code
> in the fast path of osq_unlock() improves from:
>
>  11b:   31 c9                   xor    %ecx,%ecx
>  11d:   8d 50 01                lea    0x1(%rax),%edx
>  120:   89 d0                   mov    %edx,%eax
>  122:   f0 0f b1 0f             lock cmpxchg %ecx,(%rdi)
>  126:   39 c2                   cmp    %eax,%edx
>  128:   75 05                   jne    12f <...>
>
> to:
>
>  12b:   31 d2                   xor    %edx,%edx
>  12d:   83 c0 01                add    $0x1,%eax
>  130:   f0 0f b1 17             lock cmpxchg %edx,(%rdi)
>  134:   75 05                   jne    13b <...>
>
> Signed-off-by: Uros Bizjak <ubizjak@xxxxxxxxx>

Thanks!
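
For readers following along, the difference between the two quoted
patterns can be sketched in userspace C11. This is only an
illustration, not the kernel implementation: sketch_cmpxchg() and
sketch_try_cmpxchg() are hypothetical stand-ins built on <stdatomic.h>,
and they use the default sequentially consistent ordering rather than
the release ordering named in the subject line.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the "cmpxchg" style: returns the value
 * that was found in *ptr, so the caller has to compare it against
 * 'old' to learn whether the exchange happened.
 */
static int sketch_cmpxchg(atomic_int *ptr, int old, int new_val)
{
	/* On failure, C11 stores the observed value into 'old'. */
	atomic_compare_exchange_strong(ptr, &old, new_val);
	return old;
}

/*
 * Hypothetical stand-in for the "try_cmpxchg" style: returns success
 * directly and, on failure, writes the observed value back to *old.
 */
static bool sketch_try_cmpxchg(atomic_int *ptr, int *old, int new_val)
{
	return atomic_compare_exchange_strong(ptr, old, new_val);
}

/* Exercises both styles on the same starting value. */
static void sketch_selfcheck(void)
{
	atomic_int a = 5, b = 5;
	int old = 5;

	/* Old pattern: extra comparison on the returned value. */
	assert(sketch_cmpxchg(&a, 5, 0) == 5);
	assert(atomic_load(&a) == 0);

	/* New pattern: the boolean result is the success indication. */
	assert(sketch_try_cmpxchg(&b, &old, 0));
	assert(atomic_load(&b) == 0);

	/* Failure case: 'old' is refreshed with the current value. */
	old = 7;
	assert(!sketch_try_cmpxchg(&b, &old, 9));
	assert(old == 0);
	assert(atomic_load(&b) == 0);
}
```

In the first form the caller compares the returned value with the
expected one; in the second the success indication comes straight from
the operation, which corresponds to the CMP that disappears in the
quoted disassembly.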