futex_cmpxchg_enabled breakage
From: Rich Felker
Date: Wed Aug 29 2018 - 18:22:27 EST
I just spent a number of hours helping someone track down a bug that
looks like some kind of futex_cmpxchg_enabled detection error on
powerpc64. I'm still not sure of the root cause, but the symptom is
set_robust_list failing with -ENOSYS. A while back I hit the same
problem on sh2, where nommu never produces -EFAULT; that led to commit
72cc564f16ca. I think the test (introduced way back in commit
a0c1e9073ef7) is fundamentally buggy; if anything, it should be
checking for != -ENOSYS, not == -EFAULT. Presumably it could also fail
to produce -EFAULT if mmap_min_addr is 0 and page 0 is mapped (a bad
idea, but maybe someone does it...). And of course other nommu archs
are possibly still broken.
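
To make the difference between the two checks concrete, here is a
minimal userspace model. It is only a sketch: the function names are
made up for illustration, and probe_ret stands in for whatever the
boot-time cmpxchg probe on a NULL user address returns.

```c
#include <errno.h>

/* Rule as it stands today (per the description above): treat the
 * cmpxchg as usable only if the probe faulted with -EFAULT. */
int enabled_if_efault(int probe_ret)
{
    return probe_ret == -EFAULT;
}

/* Suggested rule: treat the cmpxchg as usable unless the arch
 * explicitly reports "not implemented". */
int enabled_unless_enosys(int probe_ret)
{
    return probe_ret != -ENOSYS;
}
```

With a probe that returns 0 (no fault, as on nommu or with page 0
mapped), the first rule disables the feature and the second keeps it
enabled. Both rules agree when the probe returns -EFAULT or -ENOSYS.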
Ultimately from an API perspective, the functionality that depends on
futex_cmpxchg_enabled is non-optional, and the current approach of
treating it as something that can be disabled via detection at runtime
is fragile and wrong.
If there are no archs that support SMP but don't provide their own
asm/futex.h (as opposed to the asm-generic one, which returns -ENOSYS
on SMP), the detection code should just be removed, and the SMP case
in asm-generic/futex.h should be turned into an #error.
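
Roughly, the header would end up shaped like the sketch below. This is
not a patch: CONFIG_SMP is the real config symbol, but the error text
and the GENERIC_FUTEX_UP_FALLBACK marker are placeholders invented for
illustration.

```c
/* Sketch of the proposed structure for the generic futex header. */
#ifdef CONFIG_SMP
/* An SMP arch reaching this header has no working futex cmpxchg,
 * so fail the build instead of returning -ENOSYS at runtime. */
#error "SMP architectures must provide their own asm/futex.h"
#else
/* UP case: the existing non-atomic implementation would stay here.
 * Hypothetical marker so the sketch has something to compile. */
#define GENERIC_FUTEX_UP_FALLBACK 1
#endif
```

The point of the design is that a missing implementation becomes a
build failure that the arch maintainer sees, rather than a runtime
-ENOSYS that userspace sees.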
If there are archs that support SMP but don't provide their own
working asm/futex.h, then asm-generic/futex.h's SMP case should be
enhanced to perform a stop-the-world IPI and then do the same thing as
the non-SMP case (disable preemption[/interrupts?], perform the
cmpxchg non-atomically).
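
As a rough userspace model of that fallback: stop_world() and
resume_world() are hypothetical stand-ins for the stop-the-world IPI
plus disabling preemption, and the NULL check is a stand-in for a
faulting user access. None of these names exist in the kernel.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; here they only track nesting depth. */
int world_stopped;
void stop_world(void)   { world_stopped++; }
void resume_world(void) { world_stopped--; }

/* Compare-and-exchange done as a plain load and store. This is only
 * safe while nothing else can touch *uaddr. On success returns 0 and
 * stores the value that was observed in *uval. */
int cmpxchg_nonatomic(uint32_t *uval, uint32_t *uaddr,
                      uint32_t oldval, uint32_t newval)
{
    uint32_t val;

    if (uaddr == NULL)
        return -EFAULT;      /* models a faulting user address */

    stop_world();
    val = *uaddr;            /* plain load */
    if (val == oldval)
        *uaddr = newval;     /* plain store, only on a match */
    resume_world();

    *uval = val;
    return 0;
}
```

The caller compares *uval against oldval to tell whether the exchange
happened, the same way it would with a real atomic cmpxchg.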
Thoughts? Would a patch to do this be acceptable?
Rich