Re: [PATCH 0/1] Rosebush, a new hash table

From: Herbert Xu
Date: Sun Feb 25 2024 - 00:33:05 EST


On Sun, Feb 25, 2024 at 05:01:19AM +0000, Matthew Wilcox wrote:
>
> Task a trivial example where you have four entries unevenly distributed
> between two buckets, three in one bucket and one in the other. Now 3/4
> of your lookups hit in one bucket and 1/4 in the other bucket.
> Obviously it's not as pronounced if you have 1000 buckets with 1000
> entries randomly distributed between the buckets. But that distribution
> is not nearly as even as you might expect:
>
> $ ./distrib
> 0: 362
> 1: 371
> 2: 193
> 3: 57
> 4: 13
> 5: 4

Indeed, that's why rhashtable only triggers a forced rehash at
a chain length of 16 even though we expect the average chain length
to be just 1.

The theoretical worst-case value is expected to be O(lg n/lg lg n).
However, I think 16 was picked because it was sufficient even for a
hash table that filled all memory. Of course if anyone can provide
some calculation showing that this is insufficient I'm happy to raise
the limit.

Cheers,
--
Email: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt