Re: [PATCH 5/6] rhashtable: support guaranteed successful insertion.

From: Herbert Xu
Date: Wed Mar 28 2018 - 03:27:47 EST


On Wed, Mar 28, 2018 at 06:04:40PM +1100, NeilBrown wrote:
>
> I disagree. My patch 6 only makes it common instead of exceedingly
> rare. If any table in the list other than the first has a chain with 16
> elements, then trying to insert an element with a hash which matches
> that chain will fail with -EBUSY. This is theoretically possible
> already, though astronomically unlikely. So that case will never be
> tested for.

No, that's not true. If the table is correctly sized, the
probability of having a chain with 16 elements is extremely low.

Even if it does happen we won't fail, because we will perform
an immediate rehash. We only fail if it happens again right
after the rehash (that is, at least another 16 elements hashing
to the same chain have been inserted while the new hash table
is still being populated, and you're trying to insert a 17th),
which means that somebody has figured out our hash secret, and
failing in that case makes sense.
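
Roughly, the logic is this (a sketch only, not the real
lib/rhashtable.c code -- chain_len, rehash_in_progress and
schedule_rehash are invented helpers; only the constant 16 and
the -EBUSY path reflect the behaviour being discussed):

#include <linux/errno.h>

#define ELASTICITY	16	/* maximum tolerated chain length */

struct bucket_table;

unsigned int chain_len(struct bucket_table *tbl, unsigned int hash);
int rehash_in_progress(struct bucket_table *tbl);
void schedule_rehash(struct bucket_table *tbl);

int insert_chain_check(struct bucket_table *tbl, unsigned int hash)
{
	if (chain_len(tbl, hash) < ELASTICITY)
		return 0;		/* common case: short chain */

	if (!rehash_in_progress(tbl)) {
		/* First overlong chain: rehash with a fresh seed
		 * instead of failing the insertion. */
		schedule_rehash(tbl);
		return 0;
	}

	/* The chain overflowed again while the previous rehash is
	 * still populating the new table: somebody has likely worked
	 * out the hash secret, so fail hard. */
	return -EBUSY;
}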

> It is hard to know if it is necessary. And making the new table larger
> will make the error less likely, but still won't make it impossible. So
> callers will have to handle it - just like they currently have to handle
> -ENOMEM even though it is highly unlikely (and not strictly necessary).

Callers should not handle an ENOMEM error by retrying. Nor should
they retry an EBUSY return value.
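
To spell that out from the caller's side (a hypothetical user:
my_obj, its node/key members and my_params are made up, but
rhashtable_insert_fast() is the normal insertion interface):

#include <linux/rhashtable.h>
#include <linux/slab.h>

struct my_obj {
	u32 key;
	struct rhash_head node;
};

static const struct rhashtable_params my_params = {
	.key_len	= sizeof(u32),
	.key_offset	= offsetof(struct my_obj, key),
	.head_offset	= offsetof(struct my_obj, node),
};

static int add_object(struct rhashtable *ht, struct my_obj *obj)
{
	int err = rhashtable_insert_fast(ht, &obj->node, my_params);

	/* Both -EBUSY (table degenerated, likely under attack) and
	 * -ENOMEM are hard failures.  Free the object and report the
	 * error; do not loop on reinsertion. */
	if (err)
		kfree(obj);

	return err;
}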

> Are these errors ever actually useful? I thought I had convinced myself
> before that they were (to throttle attacks on the hash function), but
> they happen even less often than I thought.

The EBUSY error indicates that the hash table has essentially
degenerated into a linked list because somebody has worked out
our hash secret.

> Maybe. Reading a percpu counter isn't cheap. Reading it whenever a hash
> chain reaches 16 is reasonable, but I think we would want to read it a
> lot more often than that. So probably store the last-sampled time (with
> no locking) and only sample the counter if last-sampled is more than
> jiffies - 10*HZ (???)

We could also take the spinlock table approach and have a counter
per bucket spinlock. This should be sufficient, as you'll be
contending on the bucket spinlock table anyway.

This also allows us to estimate the total table size and not have
to always do a last-ditch growth when it's too late.
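
Something along these lines (again just a sketch assuming the
per-bucket lock array: struct bucket_lock and its field names are
invented here, not the real struct bucket_table layout):

#include <linux/spinlock.h>

struct bucket_lock {
	spinlock_t	lock;
	unsigned int	nelems;	/* entries under this lock's buckets */
};

/* Bump the counter while already holding the bucket lock, so the
 * insert fast path needs no extra atomics. */
static void count_insert(struct bucket_lock *bl)
{
	lockdep_assert_held(&bl->lock);
	bl->nelems++;
}

/* Sum the (small) lock table for an approximate element count.
 * The result is racy but good enough to trigger growth early
 * instead of waiting for a last-ditch rehash. */
static unsigned int estimate_nelems(struct bucket_lock *locks,
				    unsigned int nlocks)
{
	unsigned int i, total = 0;

	for (i = 0; i < nlocks; i++)
		total += READ_ONCE(locks[i].nelems);

	return total;
}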

Cheers,
--
Email: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt