Re: [PATCH - revised] rhashtable: detect when object movement might have invalidated a lookup

From: NeilBrown
Date: Fri Jul 20 2018 - 02:30:46 EST


On Thu, Jul 19 2018, David Miller wrote:

> From: NeilBrown <neilb@xxxxxxxx>
> Date: Mon, 16 Jul 2018 09:57:11 +1000
>
>> Some users of rhashtable might need to change the key
>> of an object and move it to a different location in the table.
>> Other users might want to allocate objects using
>> SLAB_TYPESAFE_BY_RCU which can result in the same memory allocation
>> being used for a different (type-compatible) purpose and similarly
>> end up in a different hash-chain.
>>
>> To support these, we store a unique NULLS_MARKER at the end of
>> each chain, and when a search fails to find a match, we check
>> if the NULLS marker found was the expected one. If not,
>> the search is repeated.
>>
>> The unique NULLS_MARKER is derived from the address of the
>> head of the chain.
>>
>> If an object is removed and re-added to the same hash chain, we won't
>> notice by looking at the NULLS marker. In this case we must be sure
>> that it was not re-added *after* its original location, or a lookup may
>> incorrectly fail. The easiest solution is to ensure it is inserted at
>> the start of the chain. insert_slow() already does that,
>> insert_fast() does not. So this patch changes insert_fast to always
>> insert at the head of the chain.
>>
>> Note that such a user must do their own double-checking of
>> the object found by rhashtable_lookup_fast() after ensuring
>> mutual exclusion with anything that might change the key, such as
>> successfully taking a new reference.
>>
>> Signed-off-by: NeilBrown <neilb@xxxxxxxx>
>
> Neil I have to be honest with you.

Thank you.

>
> During this whole ordeal I was under the impression that this was all
> going to be used for something in-tree. But now I see that you want
> to use all of this stuff for lustre which is out of tree.
>
> It would be extremely hard for me to accept adding this kind of
> complexity and weird semantics to an already extremely complicated
> and delicate piece of infrastructure even if something in-tree would use
> it.
>
> But for something out-of-tree? I'm sorry, no way.

That's unfortunate, but I can live with it. nulls-list support is
just a nice-to-have for me.
I'll resend the patch with the unwanted complexity removed.
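
For anyone following the thread, the nulls-marker check from the quoted
description can be sketched in userspace C roughly as below. This is only
an illustration under stated assumptions, not the rhashtable code: the
names (struct node, struct table, nulls_for, walk, lookup, insert_head)
are hypothetical, there is no RCU or locking, and the marker is simply the
address of the bucket head with the low bit set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 4

/* Hypothetical object type; 'next' is a real node or a tagged end marker. */
struct node {
	int key;
	struct node *next;
};

struct table {
	struct node *bucket[NBUCKETS];
};

/* End-of-chain marker: address of the chain head with bit 0 set.
 * Assumes pointers are at least 2-byte aligned, so bit 0 is free. */
static struct node *nulls_for(struct node **head)
{
	return (struct node *)((uintptr_t)head | 1);
}

static bool is_nulls(const struct node *p)
{
	return ((uintptr_t)p & 1) != 0;
}

static unsigned int hash(int key)
{
	return (unsigned int)key % NBUCKETS;
}

/* Every empty chain ends in its own unique marker. */
static void table_init(struct table *t)
{
	for (unsigned int i = 0; i < NBUCKETS; i++)
		t->bucket[i] = nulls_for(&t->bucket[i]);
}

/* Always insert at the head of the chain, as the description requires. */
static void insert_head(struct table *t, struct node *n)
{
	struct node **head = &t->bucket[hash(n->key)];

	assert(!is_nulls(n));	/* node address must have bit 0 clear */
	n->next = *head;
	*head = n;
}

/* Walk from 'p' until a match or an end marker; report the marker seen. */
static struct node *walk(struct node *p, int key, struct node **end)
{
	while (!is_nulls(p)) {
		if (p->key == key)
			return p;
		p = p->next;
	}
	*end = p;
	return NULL;
}

/* A failed search is only trusted if it ended on the expected marker;
 * otherwise the walk strayed into another chain and is repeated. */
static struct node *lookup(struct table *t, int key)
{
	struct node **head = &t->bucket[hash(key)];
	struct node *end = NULL, *n;

	do {
		n = walk(*head, key, &end);
	} while (!n && end != nulls_for(head));
	return n;
}
```

Without concurrent writers the retry loop in lookup() never repeats; the
point is only that a walk which ends on some other chain's marker can be
told apart from a genuine miss.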

Does this ruling also apply to the bit-spin-lock changes and the
per-cpu-counter changes that I have proposed?
These improve scalability when updates dominate. Not having these
in mainline would mean I need to carry a separate rhashtable
implementation for lustre, which means code divergence, which isn't
healthy in the long run.

(Note that, in my mind, lustre is only temporarily out-of-tree. It is
coming back, hopefully this year).

Thanks,
NeilBrown
