Re: [PATCH net-next] rhashtable: further improve stability of rhashtable_walk
Date: Sun Dec 09 2018 - 17:51:01 EST
On Fri, Dec 07 2018, Herbert Xu wrote:
> On Wed, Dec 05, 2018 at 02:51:02PM +1100, NeilBrown wrote:
>> If the sequence:
>> obj = rhashtable_walk_next(iter);
>> rhashtable_remove_fast(ht, &obj->head, params);
>> races with another thread inserting or removing
>> an object on the same hash chain, a subsequent
>> rhashtable_walk_next() is not guaranteed to get the "next"
>> object. It is possible that an object could be
>> repeated, or missed.
>> This can be made more reliable by keeping the objects in a hash chain
>> sorted by memory address. A subsequent rhashtable_walk_next()
>> call can reliably find the correct position in the list, and thus
>> find the 'next' object.
>> It is not possible to take this approach with an rhltable as keeping
>> the hash chain in order is not so easy. When the first object with a
>> given key is removed, it is replaced in the chain with the next
>> object with the same key, and the address of that object may not be
>> correctly ordered.
>> I have not yet found any way to achieve the same stability
>> with rhltables, that doesn't have a major impact on lookup
>> or insert. No code currently in Linux would benefit from
>> such extra stability.
>> With this patch:
>> - a new object is always inserted after the last object with a
>> smaller address, or at the start.
>> - when rhashtable_walk_start() is called, it records that 'p' is not
>> 'safe', meaning that it cannot be dereferenced. The revalidation
>> that was previously done here is moved to rhashtable_walk_next()
>> - when rhashtable_walk_next() is called while p is not NULL and not
>> safe, it walks the chain looking for the first object with an
>> address greater than p and returns that. If there is none, it moves
>> to the next hash chain.
>> Signed-off-by: NeilBrown <neilb@xxxxxxxx>
>> This is a resend of a patch that I sent back in July. I couldn't
>> applied then because it assumed another rhashtable patch which hadn't
>> landed yet - it now has.
> I thought we had agreed to drop this because nobody needs it
> currently and it doesn't handle rhlist?
I think it was agreed that I would not pursue features that were only
of use to out-of-tree code, but I don't think that applies here. This
is not a feature, this is a quality-of-implementation improvement.
There are users in the kernel today which use
to drop out of RCU protection for periods during the walk.
Any such user might miss seeing an object that has been in the table
for a while - sure that is less than optimal, and should be fixed if
the cost is small.
There are currently no rhlist users which use stop/start to drop out
of RCU, so there is no clear value in fixing that case, or cost in not
> Email: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Description: PGP signature