Re: futex_cmpxchg_enabled breakage

From: Thomas Gleixner
Date: Thu Aug 30 2018 - 05:20:06 EST


On Wed, 29 Aug 2018, Rich Felker wrote:

> I just spent a number of hours helping someone track down a bug that
> looks like it's some kind of futex_cmpxchg_enabled detection error on
> powerpc64 (still not sure of the root cause; set_robust_list producing
> -ENOSYS), and a while back I hit the same problem on sh2 due to lack
> of EFAULT on nommu, leading to commit 72cc564f16ca. I think the test
> (introduced way back in commit a0c1e9073ef7) is fundamentally buggy;
> if anything, it should be checking for !=-ENOSYS, not ==-EFAULT.

Errm? This does a futex_cmpxchg() on NULL which has to return EFAULT if
it's available. There is nothing fundamentally buggy about it at all.

> Presumably it could also fail to produce -EFAULT if mmap_min_addr is 0
> and page 0 is mapped (a bad idea, but maybe someone does it...). And
> of course other nommu archs are possibly still broken.

If NULL is mapped in the kernel then a lot of other things are broken. The
futex thing is then the least of your worries.

> Ultimately from an API perspective, the functionality that depends on
> futex_cmpxchg_enabled is non-optional, and the current approach of
> treating it as something that can be disabled via detection at runtime
> is fragile and wrong.

The availibility of the interfaces which depend on futex_cmpxchg_enabled
has been runtime detectable forever and it's documented that way. I have no
idea why you think it's non-optional. If you made it unconditional in your
lib, then it's hardly the kernels problem.

> If there are no archs that support SMP but don't provide their own
> asm/futex.h (as opposed to the asm-generic one that does -ENOSYS on
> SMP), the detection code should just be removed, and the SMP case in
> asm-generic/futex.h should be made into #error.

And why so? Just because?

> If there are archs that support SMP but don't provide their own
> working asm/futex.h, then asm-generic/futex.h's SMP case should be
> enhanced to perform a stop-the-world IPI and then do the same thing as
> the non-SMP case (disable preemption[/interrupts?], perform the
> cmpxchg non-atomically).
>
> Thoughts? Would a patch to do this be acceptable?

No. There is nothing at all in the world which requires that PI futexes and
robust list are provided and even if you implement that hack in the kernel
then the user space side still does not work because the user space part of
those interfaces has a hard dependency on working cmpxchg as well.

Thanks,

tglx