Re: [PATCH v10 00/21] futex: Add support task local hash maps, FUTEX2_NUMA and FUTEX2_MPOL
From: Sebastian Andrzej Siewior
Date: Wed Mar 26 2025 - 10:01:41 EST
On 2025-03-26 18:24:37 [+0530], Shrikanth Hegde wrote:
> > Anyway. To avoid the atomic part we would need to have a per-CPU counter
> > instead of a global one and a more expensive slow path for the resize
> > since you have to sum up all the per-CPU counters and so on. Not sure it
> > is worth it.
> >
>
> resize would happen when one does prctl right? or
> it can happen automatically too?
If prctl has been used once, then a resize happens only on another prctl
call. Without prctl it will start with 16 buckets once the first thread
is created (so you have two threads in total).
After that it will only increase the buckets, and only if
buckets < 4 * threads.
See futex_hash_allocate_default().
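To make the default sizing concrete, here is a rough user-space sketch of
the rule, read as "grow when the current bucket count is below
4 * threads". It is not the kernel code. The helper names, the rounding
up to a power of two and the max_buckets cap are assumptions made for
illustration; only the 16-bucket start, the 4 * threads factor and the
"prctl size is left alone" part come from the description above.

```c
#include <assert.h>

/* Round v up to the next power of two (valid for 1 <= v <= 2^31). */
static unsigned int roundup_pow2(unsigned int v)
{
	unsigned int p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper: the bucket count the default policy asks for.
 * Assumption: the result is rounded to a power of two and capped at
 * max_buckets. Neither detail is stated in this mail.
 */
static unsigned int wanted_buckets(unsigned int threads,
				   unsigned int max_buckets)
{
	unsigned int b;

	assert(max_buckets >= 16);
	b = roundup_pow2(4 * threads);
	if (b < 16)
		b = 16;
	if (b > max_buckets)
		b = max_buckets;
	return b;
}

/*
 * Hypothetical helper: bucket count after a thread has been created.
 * cur is the current count (0 if no private hash exists yet) and
 * custom is non-zero if the size was set via prctl.
 */
static unsigned int next_buckets(unsigned int cur, unsigned int threads,
				 int custom, unsigned int max_buckets)
{
	unsigned int want;

	if (custom)
		return cur;	/* a prctl-chosen size is never auto-resized */

	want = wanted_buckets(threads, max_buckets);
	return want > cur ? want : cur;	/* grow only, never shrink */
}
```

With these assumptions two threads give 16 buckets, four threads still
fit into 16, and the fifth thread asks for 32.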
> fph is going to be on thread leader's CPU and using atomics to do
> fph->users would likely cause cacheline bouncing no?
Yes, this can happen. And since the user can still resize after using
prctl, we can't avoid the inc/dec even if we switch to custom mode.
> Not sure if this happens only due to this benchmark which doesn't actually block.
> Maybe the real life use-case this doesn't matter.
That is what I assume. You only go into the kernel if the futex is
occupied. If multiple threads do this at once then the cacheline
bouncing is unfortunate.
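For illustration, the per-CPU counter alternative mentioned at the top of
this mail could look roughly like the following user-space sketch using
C11 atomics. The struct and function names, NR_SLOTS and the 64-byte
cache line size are made up for the example, and this is not what the
series does. The point is only to show the trade-off: the get/put fast
path touches a single slot, while the resize slow path has to walk all
of them.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_SLOTS 8	/* hypothetical stand-in for the number of CPUs */

/* One counter per slot, aligned so each sits on its own cache line. */
struct slot_ref {
	_Alignas(64) atomic_long cnt;
};

struct split_ref {
	struct slot_ref slot[NR_SLOTS];
};

static void split_ref_init(struct split_ref *r)
{
	unsigned int i;

	assert(r);
	for (i = 0; i < NR_SLOTS; i++)
		atomic_init(&r->slot[i].cnt, 0);
}

/* Fast path: only the caller's own slot is written, so no bouncing. */
static void split_ref_get(struct split_ref *r, unsigned int cpu)
{
	atomic_fetch_add_explicit(&r->slot[cpu % NR_SLOTS].cnt, 1,
				  memory_order_relaxed);
}

/* A put may land on a different slot than the get; only the sum counts. */
static void split_ref_put(struct split_ref *r, unsigned int cpu)
{
	atomic_fetch_sub_explicit(&r->slot[cpu % NR_SLOTS].cnt, 1,
				  memory_order_relaxed);
}

/*
 * Slow path: add up every slot. The result is only exact if no get/put
 * runs concurrently, so a real resize would first have to stop new
 * users from showing up.
 */
static long split_ref_sum(struct split_ref *r)
{
	long sum = 0;
	unsigned int i;

	for (i = 0; i < NR_SLOTS; i++)
		sum += atomic_load_explicit(&r->slot[i].cnt,
					    memory_order_relaxed);
	return sum;
}
```

A single slot may go negative when get and put happen on different
CPUs, which is why the sum over all slots is the only meaningful value.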
Sebastian