Re: [RFC PATCH 00/13] Add futex2 syscalls
From: Andrey Semashev
Date: Tue Feb 16 2021 - 07:14:19 EST
Sorry for posting out-of-thread, I just subscribed to the list to reply
to a post that had already been sent.
André Almeida wrote:
> ** "And what's about FUTEX_64?"
> By supporting 64 bit futexes, the kernel structure for futex would
> need to have a 64 bit field for the value, and that could defeat one of
> the purposes of having different sized futexes in the first place:
> supporting smaller ones to decrease memory usage. This might be
> something that could be disabled for 32bit archs (and even for
> CONFIG_BASE_SMALL).
> Which use case would benefit for FUTEX_64? Does it worth the trade-offs?
I strongly believe that 64-bit futex must be supported. I have a few use
cases in mind:
1. Cooperative robust futexes.
I have a real-world case where multiple processes need to communicate
via shared memory and synchronize via a futex. The processes run under a
supervisor parent process, which can detect termination of its children
and also has access to the shared memory. In order to make the
communication more or less safe in the face of one of the child
processes crashing, the futex currently contains a portion of the pid of
the process that locked it. The parent supervisor is then able to tell
that the crashed child was holding the futex locked, mark the futex as
"broken" and notify any other threads blocked on it.
Given that a pid can be up to 32 bits in size, and we also need some bits
in the futex to implement its logic (i.e. at least "locked" and "broken"
bits, some bits for the ABA counter, etc.), the pid has to be truncated,
and the above logic may break when two pids share the stored bits. In
the real application, only 15 bits are left for the pid, which is
already less than the actual pid range on the system.
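To illustrate the truncation problem, here is a minimal sketch in C. The
bit layout and all fw_* names are made up for this example (the real
application's layout is not shown here); only the 15-bit pid field
matches the description above. The 64-bit variant shows how a wider
futex word would leave room for the whole pid.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit futex word layout (an assumption for illustration):
 *   bit  0      : locked
 *   bit  1      : broken
 *   bits 2..16  : low 15 bits of the owner's pid
 *   bits 17..31 : ABA counter
 */
#define FW_LOCKED    UINT32_C(0x1)
#define FW_BROKEN    UINT32_C(0x2)
#define FW_PID_SHIFT 2
#define FW_PID_BITS  15
#define FW_ABA_BITS  15
#define FW_PID_MASK  ((UINT32_C(1) << FW_PID_BITS) - 1)
#define FW_ABA_SHIFT (FW_PID_SHIFT + FW_PID_BITS)

static_assert(FW_PID_SHIFT + FW_PID_BITS + FW_ABA_BITS == 32,
              "fields must fill the 32-bit word exactly");

static inline uint32_t fw_pack(uint32_t pid, uint32_t aba, int locked, int broken)
{
    uint32_t w = 0;
    if (locked) w |= FW_LOCKED;
    if (broken) w |= FW_BROKEN;
    w |= (pid & FW_PID_MASK) << FW_PID_SHIFT;  /* upper pid bits are lost here */
    w |= aba << FW_ABA_SHIFT;                  /* excess counter bits shift out */
    return w;
}

static inline uint32_t fw_owner(uint32_t w)
{
    return (w >> FW_PID_SHIFT) & FW_PID_MASK;
}

/* All the supervisor can check: the truncated pids match. */
static inline int fw_held_by(uint32_t w, uint32_t pid)
{
    return (w & FW_LOCKED) != 0 && fw_owner(w) == (pid & FW_PID_MASK);
}

/* Hypothetical 64-bit layout: same flags, full 32-bit pid in bits 2..33,
 * ABA counter in bits 34..63. */
static inline uint64_t fw64_pack(uint32_t pid, uint64_t aba, int locked, int broken)
{
    uint64_t w = 0;
    if (locked) w |= FW_LOCKED;
    if (broken) w |= FW_BROKEN;
    w |= (uint64_t)pid << 2;
    w |= aba << 34;
    return w;
}

static inline uint32_t fw64_owner(uint64_t w)
{
    return (uint32_t)(w >> 2);  /* keeps bits 2..33, i.e. the full pid */
}
```

With this layout, a futex locked by pid 40000 looks to the supervisor as
if it were held by pid 7232 (40000 modulo 32768), so a crash of either
process would be attributed to the lock. The 64-bit variant returns the
pid unchanged.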
Note: We're not using the proper pthread robust mutexes because we also
need a condition variable, and condition variables contain a non-robust
mutex internally, which basically nullifies robustness. One could argue
for fixing pthread instead, but I view that as a more difficult task as
the pthread interface is standardized. We would rather use futex
directly anyway because it offers more flexibility and less performance
overhead.
2. Parity with WaitOnAddress[1] on Windows.
WaitOnAddress is explicitly documented to support 8-byte states, and its
interface allows for further extension. I'm not a Wine developer, but I
would guess that having 8-byte futex support to match would be useful
there.
Besides Wine, having a 64-bit futex would be important for
std::atomic[2] and Boost.Atomic in C++, which support waiting and
notifying operations (for std::atomic, introduced in C++20). Waiting and
notifying operations are normally implemented using the futex API on Linux
and WaitOnAddress on Windows, and can be emulated with a process-wide
global mutex pool if such API is unavailable for a given atomic size on
the target platform. This means that 64-bit atomics on Linux currently
must be implemented with a lock and therefore cannot be used in
process-shared memory, while there is no such limitation on Windows.
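To show why the lock-pool fallback does not work across processes, here
is a rough sketch in C of how such a pool might pick a lock for a given
atomic object. POOL_SIZE, pool_slot() and the hashing scheme are
assumptions made for this example, not the actual scheme used by
Boost.Atomic or any std::atomic implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 64u  /* hypothetical number of locks in the pool */

static_assert((POOL_SIZE & (POOL_SIZE - 1)) == 0,
              "POOL_SIZE must be a power of two for the mask below");

/* Map the address of an atomic object to a slot in the lock pool.
 * The pool itself would live in each process's private memory. */
static inline size_t pool_slot(const volatile void *addr)
{
    uintptr_t p = (uintptr_t)addr;
    p >>= 3;  /* 8-byte objects are 8-byte aligned; drop the zero bits */
    return (size_t)(p & (POOL_SIZE - 1));
}
```

The slot is derived from the virtual address, and the locks are private
to the process. Two processes that map the same shared object (possibly
at different addresses) therefore end up with unrelated locks, which is
why only a kernel-level primitive like futex helps in that case.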
I'm not sure how much memory is saved by not having 64-bit state in the
kernel futex structures, but this doesn't look like a huge deal on
modern systems - server, desktop or mobile. It may make sense for
extremely low memory embedded systems, and for those targets the support
may be disabled with a switch. In fact, such systems would probably not
support 64-bit atomics anyway. For any other targets I would prefer
64-bit futex to be available by default.
My main issue with 64-bit support being optional, though, is that
applications and libraries like Boost.Atomic would like (or even need)
to know at compile time rather than at run time whether the feature is
available. std::atomic,
for example, is supposed to be a thin abstraction over atomic
instructions and OS primitives like futex, so performing runtime
detection of the available features in the kernel would be detrimental
there. I'm not sure if this is possible in the current kernel
infrastructure, but it would be best if the lack of 64-bit futexes in
the kernel were detectable through kernel headers (e.g. by a macro for
64-bit futexes not being defined or something like that), which means
the headers must be generated at kernel configuration time.
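As a sketch of what I mean by compile-time detection, assume the
generated header defines a FUTEX_64 macro only when the kernel is
configured with 64-bit futex support. That guarantee is hypothetical, as
are the HAVE_FUTEX_64, MAX_NATIVE_WAIT_SIZE and wait_is_native() names
below. A library could then select its implementation without any
runtime probing:

```c
#include <assert.h>
#include <stddef.h>

/* Pull in the kernel UAPI header when the toolchain can find it. */
#if defined(__has_include)
#  if __has_include(<linux/futex.h>)
#    include <linux/futex.h>
#  endif
#endif

/* ASSUMPTION: FUTEX_64 is defined by the header only if the kernel was
 * configured with 64-bit futexes. No such guarantee exists today. */
#ifdef FUTEX_64
#  define HAVE_FUTEX_64 1
#else
#  define HAVE_FUTEX_64 0
#endif

#define MAX_NATIVE_WAIT_SIZE (HAVE_FUTEX_64 ? 8 : 4)

static_assert(MAX_NATIVE_WAIT_SIZE >= 4,
              "32-bit futexes are expected to be always available");

/* Returns 1 if an atomic of the given size could wait on a futex
 * directly, 0 if it would need the lock-pool fallback. The 8- and
 * 16-bit sizes are left out of this sketch. */
static inline int wait_is_native(size_t size)
{
    switch (size) {
    case 4:  return 1;
    case 8:  return HAVE_FUTEX_64;
    default: return 0;
    }
}
```

Since HAVE_FUTEX_64 is a constant, the compiler can drop the unused
branch entirely, which is the property a thin abstraction like
std::atomic needs.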
[1]: https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitonaddress
[2]: https://en.cppreference.com/w/cpp/atomic/atomic