[RFC PATCH 0/2] futex: how to solve the robust_list race condition?
From: André Almeida
Date: Fri Feb 20 2026 - 15:27:55 EST
During LPC 2025, I presented a session about creating a new syscall for
robust_list[0][1]. However, most of the discussion in that session was not
about the new syscall itself, but about an old bug in the current
robust_list mechanism.
A bug report describing this race condition has been open since at least
2012, as Carlos O'Donell pointed out:
"File corruption race condition in robust mutex unlocking"
https://sourceware.org/bugzilla/show_bug.cgi?id=14485
To help understand the bug, I've created a reproducer (patch 1/2) and a
companion kernel hack (patch 2/2) that makes the race condition more
likely to trigger. When the bug happens, the reproducer prints a message
comparing the original memory contents with the corrupted ones:
"Memory was corrupted by the kernel: 8001fe8d8001fe8d vs 8001fe8dc0000000"
I'm not sure yet what the appropriate approach to fix it would be, so I
decided to reach out to the community before committing to a direction.
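For reference, the two values in the corruption message above are
consistent with the exit path treating the low 32 bits of that memory as
a robust futex word owned by the dying task. The following is a minimal
sketch of that arithmetic, not the kernel code. It assumes the reproducer
prints a 64-bit value whose low word is the one modified, and that the
dying task's TID is 0x1fe8d (inferred from the pattern in the message).
The helper names model_owner_died() and model_corrupt_low_word() are
hypothetical; the three constants match <linux/futex.h>.

```c
#include <stdint.h>

/* Bit layout of a futex word, as defined in the UAPI <linux/futex.h>. */
#define FUTEX_WAITERS    0x80000000u
#define FUTEX_OWNER_DIED 0x40000000u
#define FUTEX_TID_MASK   0x3fffffffu

/*
 * Simplified model (an assumption, not the kernel implementation) of what
 * the exit path does to a word it believes is a robust futex: if the TID
 * field matches the dying task, keep FUTEX_WAITERS, drop the TID and set
 * FUTEX_OWNER_DIED. Otherwise the word is left untouched.
 */
static uint32_t model_owner_died(uint32_t uval, uint32_t dying_tid)
{
	if ((uval & FUTEX_TID_MASK) != dying_tid)
		return uval;
	return (uval & FUTEX_WAITERS) | FUTEX_OWNER_DIED;
}

/*
 * Apply the model to the low 32 bits of a 64-bit value, which is where a
 * 32-bit word at that address sits on a little-endian machine.
 */
static uint64_t model_corrupt_low_word(uint64_t mem, uint32_t dying_tid)
{
	uint32_t low = (uint32_t)mem;

	return (mem & 0xffffffff00000000ull) | model_owner_died(low, dying_tid);
}
```

Under these assumptions, model_corrupt_low_word(0x8001fe8d8001fe8dull,
0x1fe8d) yields 0x8001fe8dc0000000, the second value in the message: the
TID bits are cleared, FUTEX_WAITERS is kept and FUTEX_OWNER_DIED is set,
even though the memory no longer holds a mutex.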
One suggestion from Peter[2] revolves around serializing mmap() and the
robust list exit path, which might add overhead to the common case,
where list_op_pending is empty.
However, given that a new interface is being prepared, this could also be
an opportunity to rethink how list_op_pending works and get rid of the
race condition by design.
Feedback is very much welcome.
Thanks!
André
[0] https://lore.kernel.org/lkml/20251122-tonyk-robust_futex-v6-0-05fea005a0fd@xxxxxxxxxx/
[1] https://lpc.events/event/19/contributions/2108/
[2] https://lore.kernel.org/lkml/20241219171344.GA26279@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
André Almeida (2):
futex: Create reproducer for robust_list race condition
futex: Add debug delays
kernel/futex/core.c | 10 +++
robust_bug.c | 178 ++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 188 insertions(+)
create mode 100644 robust_bug.c
--
2.53.0