Re: [PATCH v11 15/19] futex: Implement FUTEX2_NUMA
From: Sebastian Andrzej Siewior
Date: Mon Apr 07 2025 - 12:58:56 EST
On 2025-04-07 17:57:38 [+0200], To linux-kernel@xxxxxxxxxxxxxxx wrote:
> --- a/kernel/futex/core.c
> +++ b/kernel/futex/core.c
> @@ -332,15 +337,35 @@ __futex_hash(union futex_key *key, struct futex_private_hash *fph)
…
> + if (node == FUTEX_NO_NODE) {
> + /*
> + * In case of !FLAGS_NUMA, use some unused hash bits to pick a
> + * node -- this ensures regular futexes are interleaved across
> + * the nodes and avoids having to allocate multiple
> + * hash-tables.
> + *
> + * NOTE: this isn't perfectly uniform, but it is fast and
> + * handles sparse node masks.
> + */
> + node = (hash >> futex_hashshift) % nr_node_ids;
Forgot to mention earlier: this % nr_node_ids turns into a div and it is
visible in perf top while looking at __futex_hash(). We could round
nr_node_ids down to a power-of-two (which should already be the case in
my 1, 2 and 4 node NUMA world) and then use AND instead.
32-bit ARM supports neither NUMA nor, on many cores, a hardware div, so
it is not a concern there.
Maybe a fast path for 1/2/4 nodes would make sense since those are the
most common cases. If you consider it, I could run tests to see how
significant it is. It might pop up in "perf bench futex hash" but not be
significant in the general use case. I had some hacks and those did not
improve the numbers as much as I had hoped.
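To make the AND idea concrete, here is a rough userspace sketch. It is
not taken from the patch or from my hacks; the helper names
(rounddown_pow2, pick_node_mod, pick_node_mask) are made up for
illustration, and in the kernel the mask would presumably be computed
once at init rather than per call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: largest power of two <= n, for n > 0. */
static inline uint32_t rounddown_pow2(uint32_t n)
{
	assert(n > 0);
	while (n & (n - 1))
		n &= n - 1;	/* clear the lowest set bit until one bit remains */
	return n;
}

/* Generic path, same shape as the quoted hunk: '%' compiles to a div. */
static inline uint32_t pick_node_mod(uint32_t hash, unsigned int shift,
				     uint32_t nr_nodes)
{
	return (hash >> shift) % nr_nodes;
}

/*
 * Mask path: node_mask is assumed to be rounddown_pow2(nr_nodes) - 1,
 * precomputed by the caller, so the hot path is a shift and an AND.
 */
static inline uint32_t pick_node_mask(uint32_t hash, unsigned int shift,
				      uint32_t node_mask)
{
	return (hash >> shift) & node_mask;
}
```

For a power-of-two node count both helpers return the same node. For
anything else the mask variant never picks the nodes above the
rounded-down count, so regular futexes would no longer be interleaved
across all nodes. That is the trade-off of rounding down.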
> + if (!node_possible(node)) {
> + node = find_next_bit_wrap(node_possible_map.bits,
> + nr_node_ids, node);
> + }
> + }
> +
> + return &futex_queues[node][hash & futex_hashmask];
> }
>
> /**
Sebastian