Re: [RFC PATCH 0/3 v3] futex/sched: introduce FUTEX_SWAP operation
From: Peter Oskolkov
Date: Mon Jun 29 2020 - 17:04:12 EST

Hi Thomas, Ingo!

Do you have any comments/suggestions/objections here? FUTEX_SWAP seems
to be quite useful for fast task context switching, and several teams
at Google would like to see this capability upstreamed.

Thanks,
Peter

On Wed, Jun 24, 2020 at 11:53 AM Peter Oskolkov <posk@xxxxxxx> wrote:
>
> From: Peter Oskolkov <posk@xxxxxxxxxx>
>
> This is an RFC!
>
> As Paul Turner presented at LPC in 2013 ...
> - pdf: http://pdxplumbers.osuosl.org/2013/ocw//system/presentations/1653/original/LPC%20-%20User%20Threading.pdf
> - video: https://www.youtube.com/watch?v=KXuZi9aeGTw
>
> ... Google has developed an M:N userspace threading subsystem backed
> by a Google-private SwitchTo Linux kernel API (page 17 in the pdf referenced
> above). This subsystem gives latency-sensitive services at Google
> fine-grained user-space control over what is running when, and it is
> widely used internally (where it is known as schedulers or fibers).
>
> This RFC patchset is the first step to open-source this work. As explained
> in the linked pdf and video, the SwitchTo API has three core operations: wait,
> resume, and swap (=switch). This patchset therefore adds a FUTEX_SWAP operation
> that, together with FUTEX_WAIT and FUTEX_WAKE, will provide a foundation
> on top of which user-space threading libraries can be built.
>
> Another common use case for FUTEX_SWAP is RPC-style message passing
> between tasks: task/thread T1 prepares a message, wakes T2 to work on it,
> and waits for the results; when T2 is done, it wakes T1 and waits for more
> work to arrive. Currently the simplest way to implement this is:
>
> a. T1: futex-wake T2, futex-wait
> b. T2: wakes, does what it has been woken to do
> c. T2: futex-wake T1, futex-wait
>
> With FUTEX_SWAP, each of steps a and c above can be reduced to a single
> futex operation that runs 5-10 times faster.
>
> Patches in this patchset:
>
> Patch 1: introduce FUTEX_SWAP futex operation that,
> internally, does wake + wait. The purpose of this patch is
> to work out the API.
> Patch 2: a first rough attempt to make FUTEX_SWAP faster than
> what wake + wait can do.
> Patch 3: a selftest that can also be used to benchmark FUTEX_SWAP vs
> FUTEX_WAKE + FUTEX_WAIT.
>
> v2: fix an undefined symbol error when CONFIG_SMP is not set.
> v3: rebased onto the latest tip/locking/core.
>
> Peter Oskolkov (3):
>   futex: introduce FUTEX_SWAP operation
>   futex/sched: add wake_up_process_prefer_current_cpu, use in FUTEX_SWAP
>   selftests/futex: add futex_swap selftest
>
> include/linux/sched.h | 1 +
> include/uapi/linux/futex.h | 2 +
> kernel/futex.c | 96 ++++++--
> kernel/sched/core.c | 5 +
> kernel/sched/fair.c | 3 +
> kernel/sched/sched.h | 1 +
> .../selftests/futex/functional/.gitignore | 1 +
> .../selftests/futex/functional/Makefile | 1 +
> .../selftests/futex/functional/futex_swap.c | 209 ++++++++++++++++++
> .../selftests/futex/include/futextest.h | 19 ++
> 10 files changed, 322 insertions(+), 16 deletions(-)
> create mode 100644 tools/testing/selftests/futex/functional/futex_swap.c
>
> --
> 2.25.1
>
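
For reference, the wake + wait ping-pong from steps a-c of the quoted cover letter can be sketched roughly as below. This is only an illustration, not code from the patchset: FUTEX_SWAP is not assumed to be present in installed headers, so the sketch uses only FUTEX_WAIT and FUTEX_WAKE, and the names struct channel, wake_peer, park and run_ping_pong are made up for the example. Each wake_peer + park pair in the two loops is the pair of syscalls that the proposed FUTEX_SWAP would fold into one.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc has no futex() wrapper, so call the syscall directly. */
static long futex_wait(_Atomic uint32_t *addr, uint32_t expected)
{
	return syscall(SYS_futex, addr, FUTEX_WAIT_PRIVATE, expected, NULL, NULL, 0);
}

static long futex_wake(_Atomic uint32_t *addr)
{
	return syscall(SYS_futex, addr, FUTEX_WAKE_PRIVATE, 1, NULL, NULL, 0);
}

/* Hypothetical shared state: one futex word per task (0 = parked, 1 = may run). */
struct channel {
	_Atomic uint32_t t1_word;
	_Atomic uint32_t t2_word;
	int message;
	int reply;
	int rounds;
};

/* Hand the "run" token to the peer and wake it if it is sleeping. */
static void wake_peer(_Atomic uint32_t *word)
{
	atomic_store(word, 1);
	futex_wake(word);
}

/* Sleep until our own word holds the token, then consume it. */
static void park(_Atomic uint32_t *word)
{
	while (atomic_exchange(word, 0) == 0)
		futex_wait(word, 0);	/* returns at once if word is no longer 0 */
}

static void *t2_main(void *arg)
{
	struct channel *ch = arg;

	for (int i = 0; i < ch->rounds; i++) {
		park(&ch->t2_word);		/* tail of step c: wait for work */
		ch->reply = ch->message + 1;	/* step b: do the work */
		wake_peer(&ch->t1_word);	/* head of step c: wake T1 */
	}
	return NULL;
}

/* Runs T1 in the calling thread; returns the number of correct replies. */
int run_ping_pong(int rounds)
{
	struct channel ch = { .rounds = rounds };
	pthread_t t2;
	int ok = 0;

	if (pthread_create(&t2, NULL, t2_main, &ch) != 0)
		return -1;

	for (int i = 0; i < rounds; i++) {
		ch.message = i;
		wake_peer(&ch.t2_word);		/* step a: futex-wake T2 ... */
		park(&ch.t1_word);		/* ... then futex-wait */
		if (ch.reply == i + 1)
			ok++;
	}
	pthread_join(t2, NULL);
	return ok;
}
```

Calling run_ping_pong(1000) from a small test program (built with -pthread) should return 1000 if every round trip completed. Timing that loop gives a wake + wait baseline in the same spirit as the benchmark in patch 3, though the selftest itself may be structured differently.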