Re: [patch v2 00/11] futex: Address the robust futex unlock race for real

From: Thomas Gleixner

Date: Sat Mar 28 2026 - 08:43:13 EST


On Fri, Mar 27 2026 at 12:50, Rich Felker wrote:
> On Fri, Mar 27, 2026 at 12:42:35AM -0300, André Almeida wrote:
>> So you call the vDSO first. If it fails, it means that the lock is contended
>> and you need to call futex(). It will wake a waiter, release the lock and
>> clean list_op_pending.
>
> So would we use the vdso function presence as signal that this
> functionality is available? In that case, I think what we would do is:
>
> 1. Try an uncontended unlock using the vdso.
> 2. If it fails, attempt FUTEX_ROBUST_UNLOCK.
> 3. If that fails (note: this could be due to seccomp!), fall back to
> the old kernel code path, holding off any munmap/etc. while we perform
> the userspace unlock.

FUTEX_ROBUST_UNLOCK is a flag, similar to FUTEX_PRIVATE, which is or'ed
into FUTEX_WAKE, FUTEX_WAKE_BITSET and FUTEX_UNLOCK_PI. It tells the
kernel to do the unlock for FUTEX_WAKE* and the pointer clearing for all
three variants. UNLOCK_PI already does the contended unlock today. So
yeah, seccomp might refuse, but then it might refuse plain FUTEX_WAKE*
too, which leaves you up a creek without a paddle.
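
To illustrate the flag composition, here is a minimal sketch of how a
library might build the op word. The FUTEX_WAKE, FUTEX_WAKE_BITSET,
FUTEX_UNLOCK_PI and FUTEX_PRIVATE_FLAG values are the ones from
<linux/futex.h>. SKETCH_FUTEX_ROBUST_UNLOCK and robust_unlock_op() are
made up for this example; the bit value is a placeholder and has to be
taken from the uapi header of a kernel which carries this series.

```c
#include <stdbool.h>

/* Existing uapi values, see <linux/futex.h> */
#define FUTEX_WAKE          1
#define FUTEX_UNLOCK_PI     7
#define FUTEX_WAKE_BITSET   10
#define FUTEX_PRIVATE_FLAG  128

/*
 * ASSUMPTION: placeholder bit for illustration only. The real value is
 * whatever the patched kernel's uapi header defines.
 */
#define SKETCH_FUTEX_ROBUST_UNLOCK 512

/*
 * Hypothetical helper: or the robust unlock flag into one of the three
 * ops which accept it. Returns -1 for any other op.
 */
static int robust_unlock_op(int base_op, bool is_private)
{
	switch (base_op) {
	case FUTEX_WAKE:
	case FUTEX_WAKE_BITSET:
	case FUTEX_UNLOCK_PI:
		break;
	default:
		return -1;
	}

	int op = base_op | SKETCH_FUTEX_ROBUST_UNLOCK;

	if (is_private)
		op |= FUTEX_PRIVATE_FLAG;
	return op;
}
```

The resulting value would be passed as the futex_op argument of the
futex() syscall in place of the plain wake/unlock op.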

If the kernel supports ROBUST UNLOCK, but does not expose the VDSO
function or lacks a VDSO altogether, you can still use the syscall for
the contended unlock. That limits your user space workaround to the
successful uncontended unlock, which the library then does itself with
try_cmpxchg(). That path is obviously not covered by the fixup as the
kernel does not know about it.
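
A rough sketch of that library side uncontended path, using C11 atomics
in place of try_cmpxchg(). It assumes the usual robust futex word
layout (owner TID in the low bits, FUTEX_WAITERS in bit 31). The enum
and the function name are invented for the example. The caller still
has to clear list_op_pending itself after a successful cmpxchg and
protect that window the old way, because the kernel cannot fix it up.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Existing uapi value, see <linux/futex.h> */
#define FUTEX_WAITERS 0x80000000u

/* Hypothetical result type for this sketch */
enum unlock_path {
	UNLOCK_DONE_USERSPACE,	/* Uncontended, word is now 0 */
	UNLOCK_NEEDS_SYSCALL,	/* Contended, let the kernel unlock */
};

/*
 * Try the uncontended unlock: the word must contain exactly the owner
 * TID with no FUTEX_WAITERS bit set. On failure the word is left
 * untouched and the caller goes to the futex() syscall.
 */
static enum unlock_path try_uncontended_unlock(_Atomic uint32_t *word,
					       uint32_t tid)
{
	uint32_t expected = tid;

	if (atomic_compare_exchange_strong_explicit(word, &expected, 0,
						    memory_order_release,
						    memory_order_relaxed))
		return UNLOCK_DONE_USERSPACE;

	return UNLOCK_NEEDS_SYSCALL;
}
```

With that, only the UNLOCK_DONE_USERSPACE case needs the user space
workaround. The UNLOCK_NEEDS_SYSCALL case is handled by the kernel,
which does the unlock and the pointer clearing.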

I briefly pondered allowing user space to register the critical section
(that's how I evaluated the approach in the first place).

But that's a can of worms we should not open at all because the kernel
needs to know the registers used (to retrieve the pending op pointer)
and the condition for successful uncontended unlock. Keeping that in
sync would be a nightmare. With the VDSO that's not an issue as the
kernel can keep the changes synchronized and validate with selftests
that it actually is correct.

Thanks,

tglx