Re: [RFC PATCH 1/2] futex: Create reproducer for robust_list race condition

From: André Almeida

Date: Thu Mar 12 2026 - 09:41:57 EST


On 12/03/2026 06:04, Sebastian Andrzej Siewior wrote:
On 2026-02-20 17:26:19 [-0300], André Almeida wrote:
--- /dev/null
+++ b/robust_bug.c

+ new->value = ((uint64_t) value << 32) + value;
+
+ /* Create a backup of the current value */
+ original_val = new->value;

Now I think I finally got it and might have understood the issue.

You exit before unlocking the futex. You free this block, and the new
allocation lands at the same address as the old one. The corruption
comes from the fact that the new content happens to match the old
content.

If the thread does unlock in userland (or kernel) but the lock remains
on the robust_list while it gets killed, then the kernel will attempt to
unlock the lock. But this requires that the futex value matches the
dead thread's TID.
So if it is unlocked (0x0) or reused, then nothing happens, unless the
new memory by accident gets assigned the same value as the TID. Then it
gets changed…

If the unlock did not happen and the lock is still owned by the thread
that was killed, then the "fixup" here is the right thing to do. The
memory should not be free()d, because the lock was still owned by the
thread. The misunderstanding here might be "once the thread is gone,
the lock is free, so we can throw away the memory". At the very least,
it was a locked mutex, and I think pthread_mutex_destroy() would
complain here.

So is the issue here that the "new" value is the same as the "old" value
and the robust-death-handle part in the kernel does its job? Or did I
over simplify something?
Let me continue with the thread…


Yes, this is exactly what I understood as well.

User thread A releases the lock, but exits before setting op_pending = NULL. Thread B can free the lock after using it, and by chance the same memory is reused for data that happens to hold the same value as thread A's TID. Then the kernel handles thread A's robust list and the corruption happens.