Re: [PATCH] futex: improve user space accesses

From: Josh Poimboeuf
Date: Fri Nov 22 2024 - 21:47:24 EST


On Fri, Nov 22, 2024 at 11:33:05AM -0800, Linus Torvalds wrote:
> Josh Poimboeuf reports that he got a "will-it-scale.per_process_ops 1.9%
> improvement" report for his patch that changed __get_user() to use
> pointer masking instead of the explicit speculation barrier. However,
> that patch doesn't actually work in the general case, because some (very
> bad) architecture-specific code actually depends on __get_user() also
> working on kernel addresses.
>
> A profile showed that the offending __get_user() was the futex code,
> which really should be fixed up to not use that horrid legacy case.
> Rewrite futex_get_value_locked() to use the modern user access helpers,
> and inline it so that the compiler not only avoids the function call for
> a few instructions, but can do CSE on the address masking.
>
> It also turns out the x86 futex functions have unnecessary barriers in
> other places, so let's fix those up too.
>
> Link: https://lore.kernel.org/all/20241115230653.hfvzyf3aqqntgp63@jpoimboe/
> Reported-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>

I didn't get a chance to try to recreate the original benchmark, but
this looks obviously correct.
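
For anyone following along, here is a minimal user-space sketch of the
pointer-masking idea mentioned in the quoted text. It is an illustration
only, not the kernel implementation: mask_user_addr() and USER_MAX are
hypothetical names made up for this example, and the real helpers are
architecture-specific.

```c
#include <stdint.h>

/* Hypothetical upper bound of the user address range (illustration only). */
#define USER_MAX 0x00007fffffffffffULL

/*
 * Clamp an address without an explicit branch in the source: an in-range
 * address is returned unchanged, while an out-of-range address is turned
 * into all ones. The data dependency on the mask is what stands in for an
 * explicit speculation barrier in this scheme.
 */
static inline uint64_t mask_user_addr(uint64_t addr)
{
	/* 0 when addr is in range, all ones when it is above USER_MAX */
	uint64_t mask = -(uint64_t)(addr > USER_MAX);

	return addr | mask;
}
```

Because the masked address is a pure function of the input, inlining the
caller lets the compiler compute it once and reuse it across several
accesses to the same address, which is the CSE benefit the patch
description refers to.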

Reviewed-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>

--
Josh