Re: [rfc] [patch 1/2] Process private hash tables for private futexes

From: Eric Dumazet
Date: Tue Mar 24 2009 - 01:32:40 EST


Ravikiran G Thirumalai wrote:
> On Mon, Mar 23, 2009 at 10:57:55PM +0100, Eric Dumazet wrote:
>> Ravikiran G Thirumalai wrote:
> [...]
>>> Hmm! How about:
>>>
>>> a) Reduce the hash table size for the private futex hash and increase
>>> the hash table size for the global hash?
>>>
>>> OR, better,
>>>
>>> b) Since it is multiple spinlocks on the same cacheline that are the
>>> PITA here, how about keeping the global table, but adding a
>>> dereference to each hash slot and interleaving the adjacent hash
>>> buckets between nodes/cpus? That way, even without losing memory to
>>> padding, we avoid false sharing on cachelines caused by unrelated
>>> futexes hashing onto adjacent hash buckets?
>>>
>> Because of jhash(), futex slots are almost random, so there is no need
>> to interleave them. Since you have a "cache line" of 4096 bytes, you
>> need almost 4 pages per cpu to statistically avoid false sharing.
>
> How did you come up with that number? So there is no way to ensure that
> adjacent cachelines will never be used in the global hash?
>
>

There is no way, unless you use one chain per 4096-byte block, and that
would be a waste of memory.
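
To illustrate the cost (a hypothetical layout of mine, not anything in
the kernel), padding each bucket head out to a 4096-byte block would
indeed kill false sharing, but at 256 slots per cpu the bucket heads
alone would then occupy 1 MB per cpu, nearly all of it padding:

/* Hypothetical, for illustration only: one hash chain per 4096-byte
 * block.  The fields mirror struct futex_hash_bucket; the alignment
 * is the wasteful part.
 */
struct padded_futex_hash_bucket {
	spinlock_t lock;
	struct plist_head chain;
} __attribute__((aligned(4096)));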

It's about statistics (assuming jhash gives us uniformly distributed
values), not hard and guaranteed constraints.

http://en.wikipedia.org/wiki/Standard_deviation
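
To see what those statistics look like, here is a small user-space
sketch (my illustration, assuming uniform hashing; not kernel code)
that distributes 30000 futexes over 256 chains and reports the mean
and standard deviation of chain lengths:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define SLOTS	256
#define FUTEXES	30000

int main(void)
{
	static unsigned int chain[SLOTS];
	double mean = (double)FUTEXES / SLOTS, var = 0.0;
	int i;

	srand(1);
	for (i = 0; i < FUTEXES; i++)
		chain[rand() % SLOTS]++;

	for (i = 0; i < SLOTS; i++)
		var += (chain[i] - mean) * (chain[i] - mean);
	var /= SLOTS;

	/* Typically ~117 +/- ~11: balanced on average, with no hard
	 * guarantee for any particular bucket. */
	printf("mean %.1f stddev %.1f\n", mean, sqrt(var));
	return 0;
}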

For a particular process, you can carefully choose (knowing the hash
function used by the kernel) user-land futexes that will *all* hash into
a *single* hash chain, protected by a single spinlock.
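
For reference, the hashing in question looks roughly like the following
(paraphrased from kernel/futex.c, so treat the details as approximate):
the bucket is fully determined by the futex key, so colliding user
addresses can be constructed once the mixing of jhash2() is known.

static struct futex_hash_bucket *hash_futex(union futex_key *key)
{
	u32 hash = jhash2((u32 *)&key->both.word,
			  (sizeof(key->both.word) + sizeof(key->both.ptr)) / 4,
			  key->both.offset);

	return &futex_queues[hash & ((1 << FUTEX_HASHBITS) - 1)];
}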

Such a process will then suffer from contention, whether it uses private
futexes and a private hash table or the shared one.

Even without knowing the hash function, there is still a small chance
that this can happen.

Please note that the hash table size serves several purposes: the first
is storing potentially many futexes (say, 30000 threads waiting on
different futexes), the second is avoiding cache line ping-pongs and
misses. This is why I chose at least 256 slots per cpu, and not just
4 cache lines per cpu.
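
A minimal user-space sketch of that sizing rule (my assumption of the
arithmetic, not the patch's exact code): at least 256 slots per cpu,
rounded up to a power of two so the hash value can be reduced with a
simple mask.

#include <stdio.h>

/* Round n up to the next power of two, like the kernel's
 * roundup_pow_of_two(). */
static unsigned long roundup_pow_of_two(unsigned long n)
{
	unsigned long p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

int main(void)
{
	unsigned long cpus;

	for (cpus = 1; cpus <= 8; cpus++)
		printf("%lu cpus -> %lu slots\n",
		       cpus, roundup_pow_of_two(256 * cpus));
	return 0;
}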

