Re: [PATCH 5/6] rhashtable: support guaranteed successful insertion.

From: Herbert Xu
Date: Thu Mar 29 2018 - 01:22:53 EST


On Thu, Mar 29, 2018 at 08:26:21AM +1100, NeilBrown wrote:
>
> I say "astronomically unlikely", you say "probability .. is extremely
> low". I think we are in agreement here.
>
> The point remains that if an error *can* be returned then I have to
> write code to handle it and test that code. I'd rather not.

You have to be able to handle errors anyway because of memory allocation
failures. Ultimately, if you keep inserting, you will eventually
fail with ENOMEM. So I don't see the issue with an additional
error value.
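
For reference, here is a minimal sketch of the error handling a caller
already needs today: rhashtable_insert_fast() can fail with -ENOMEM when
growing the table fails, and (in the unlikely case discussed above) with
-EBUSY.  The my_obj/my_params names are placeholders, not anything in
the tree:

#include <linux/rhashtable.h>

struct my_obj {
	struct rhash_head node;
	u32 key;
};

static const struct rhashtable_params my_params = {
	.head_offset	= offsetof(struct my_obj, node),
	.key_offset	= offsetof(struct my_obj, key),
	.key_len	= sizeof(u32),
	.automatic_shrinking = true,
};

static int my_insert(struct rhashtable *ht, struct my_obj *obj)
{
	int err;

	err = rhashtable_insert_fast(ht, &obj->node, my_params);
	switch (err) {
	case 0:
		return 0;
	case -ENOMEM:		/* allocating a bigger table failed */
	case -EBUSY:		/* forced rehash while one is in flight */
	default:
		/* Caller must cope: drop the object or retry later. */
		return err;
	}
}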

> > Even if it does happen we won't fail because we will perform
> > an immediate rehash. We only fail if it happens right away
> > after the rehash (that is, at least another 16 elements have
> > been inserted and you're trying to insert a 17th element, all
> > while the new hash table has not been completely populated),
> > which means that somebody has figured out our hash secret and
> > failing in that case makes sense.

BTW, you didn't acknowledge this bit, which I think is crucial to
judging how likely such an error is.

> I never suggested retrying, but I would have to handle it somehow. I'd
> rather not.

...

> While I have no doubt that there are hashtables where someone could try
> to attack the hash, I am quite sure there are others where such an
> attack is meaningless - any code which could generate the required range of
> keys could do far worse things more easily.

Our network hashtable has to be secure against adversaries. I
understand that this may not be important to your use-case. However,
given that the failure would only occur if an adversary is present
and actively attacking your hash table, I don't think it has much
of a negative effect on your use-case either.

Of course, if you can reproduce the EBUSY error without your
disable_count patch, or if you can still reproduce it even after
fixing the issue I have pointed out in your disable_count patch,
then that would suggest a real bug and we would need to fix it,
for everyone.

> Yes, storing a sharded count in the spinlock table does seem like an
> appropriate granularity. However that leads me to ask: why do we have
> the spinlock table? Why not bit spinlocks in the hashchain head like
> include/linux/list_bl uses?

The spinlock table predates rhashtable. Perhaps Thomas/Eric/Dave
can elucidate this.
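
For comparison, a minimal sketch of the approach being alluded to:
include/linux/list_bl.h keeps the lock in bit 0 of the chain-head
pointer itself (via bit_spin_lock()), so no separate spinlock table
is needed.  The bucket/object names below are illustrative only:

#include <linux/list_bl.h>

struct my_bl_obj {
	struct hlist_bl_node node;
	u32 key;
};

static void my_bl_insert(struct hlist_bl_head *bucket,
			 struct my_bl_obj *obj)
{
	hlist_bl_lock(bucket);		/* takes bit 0 of the head pointer */
	hlist_bl_add_head(&obj->node, bucket);
	hlist_bl_unlock(bucket);
}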

> I don't understand how it can ever be "too late", though I appreciate
> that in some cases "sooner" is better than "later".
> If we give up on the single atomic_t counter, then we must accept that
> the number of elements could exceed any given value. The only promise
> we can provide is that it won't exceed N% of the table size for more than
> T seconds.

Sure. However, if we use a hash-based estimate, it *should* be
fairly accurate, assuming your hash function is working in the first
place. That is completely different from an estimate based on a
per-cpu count, which could be wildly inaccurate.
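
To make the trade-off concrete, here is a rough sketch (illustrative
only, not the rhashtable code) of the "sharded count" idea from Neil's
mail: one counter per bucket-lock shard, summed only when a grow/shrink
decision has to be made.  With a working hash function each shard sees
roughly nelems/nr_shards entries, so the sum (or even a sample of a few
shards) bounds occupancy well enough for the "no more than N% for more
than a short window" promise discussed above:

#include <linux/spinlock.h>
#include <linux/cache.h>
#include <linux/compiler.h>

struct shard_count {
	spinlock_t	lock;		/* the existing per-shard bucket lock */
	unsigned int	count;		/* elements hashed into this shard */
} ____cacheline_aligned_in_smp;

static unsigned int approx_nelems(struct shard_count *shards,
				  unsigned int nr_shards)
{
	unsigned int i, sum = 0;

	for (i = 0; i < nr_shards; i++)
		sum += READ_ONCE(shards[i].count);	/* racy, but error is bounded */

	return sum;
}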

Cheers,
--
Email: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt