Re: [patch v2 11/11] x86/vdso: Implement __vdso_futex_robust_try_unlock()
From: Uros Bizjak
Date: Fri Mar 20 2026 - 03:14:53 EST
On Fri, Mar 20, 2026 at 12:25 AM Thomas Gleixner <tglx@xxxxxxxxxx> wrote:
> mov %esi,%eax // Load TID into EAX
> xor %ecx,%ecx // Set ECX to 0
> lock cmpxchg %ecx,(%rdi) // Try the TID -> 0 transition
> .Lstart:
> jnz .Lend
> movq %rcx,(%rdx) // Clear list_op_pending
> .Lend:
> ret
[...]
> + * Assembly template for the try unlock functions. The basic functionality is:
> + *
> + * movl %esi, %eax Move the TID into EAX
> + * xor %ecx, %ecx Clear ECX
> + * lock cmpxchgl %ecx, (%rdi) Attempt the TID -> 0 transition
> + * .Lcs_start: Start of the critical section
> + * jnz .Lcs_end If the cmpxchgl failed, jump to the end
> + * .Lcs_success: Start of the success section
> + * movq %rcx, (%rdx) Set the pending op pointer to 0
> + * .Lcs_end: End of the critical section
> + *
> + * .Lcs_start and .Lcs_end establish the critical section range. .Lcs_success is
> + * technically not required, but there for illustration, debugging and testing.
> + *
> + * When CONFIG_COMPAT is enabled, the 64-bit VDSO provides two functions:
> + * one for the regular 64-bit sized pending operation pointer and one for a
> + * 32-bit sized pointer to support gaming emulators.
> + *
> + * The 32-bit VDSO provides only the one for 32-bit sized pointers.
> + */
> +#define __stringify_1(x...) #x
> +#define __stringify(x...) __stringify_1(x)
> +
> +#define LABEL(name, which) __stringify(name##_futex_try_unlock_cs_##which:)
> +
> +#define JNZ_END(name) "jnz " __stringify(name) "_futex_try_unlock_cs_end\n"
> +
> +#define CLEAR_POPQ "movq %[zero], %a[pop]\n"
> +#define CLEAR_POPL "movl %k[zero], %a[pop]\n"
> +
> +#define futex_robust_try_unlock(name, clear_pop, __lock, __tid, __pop) \
> +({ \
> + asm volatile ( \
> + " \n" \
> + " lock cmpxchgl %k[zero], %a[lock] \n" \
> + " \n" \
> + LABEL(name, start) \
> + " \n" \
> + JNZ_END(name) \
> + " \n" \
> + LABEL(name, success) \
> + " \n" \
> + clear_pop \
> + " \n" \
> + LABEL(name, end) \
> + : [tid] "+&a" (__tid) \
> + : [lock] "D" (__lock), \
> + [pop] "d" (__pop), \
> + [zero] "S" (0UL) \
[zero] is just an internal scratch value, so the above constraint can be
relaxed to "r" (*). If it remains a hard register constraint ("S", i.e.
%rsi) for some reason, then the two comments above should be updated to
reflect the actual constraint.
With the constraint changed to "r":
Acked-by: Uros Bizjak <ubizjak@xxxxxxxxx> (for asm template)
(*) "r" allows the compiler some more freedom. The compiler tracks the
values held in registers, so it can reuse zero from an unrelated register
without moving it to %rsi and without clobbering the source register. In
non-trivial functions, there is a high chance that the needed value is
already available in some register.
Uros.