Re: [PATCH 2/3] rhashtable: don't hold lock on first table throughout insertion.
From: Herbert Xu
Date: Fri Mar 15 2019 - 01:47:29 EST
On Thu, Mar 14, 2019 at 04:05:28PM +1100, NeilBrown wrote:
> rhashtable_try_insert() currently holds a lock on the bucket in
> the first table, while also locking buckets in subsequent tables.
> This is unnecessary and looks like a hold-over from some earlier
> version of the implementation.
>
> As insert and remove always lock a bucket in each table in turn, and
> as insert only inserts in the final table, there cannot be any races
> that are not covered by simply locking a bucket in each table in turn.
>
> When an insert call reaches that last table it can be sure that there
> is no matchinf entry in any other table as it has searched them all, and
> insertion never happens anywhere but in the last table. The fact that
> code tests for the existence of future_tbl while holding a lock on
> the relevant bucket ensures that two threads inserting the same key
> will make compatible decisions about which is the "last" table.
>
> This simplifies the code and allows the ->rehash field to be
> discarded.
>
> We still need a way to ensure that a dead bucket_table is never
> re-linked by rhashtable_walk_stop(). This can be achieved by calling
> call_rcu() inside the locked region, and checking with
> rcu_head_after_call_rcu() in rhashtable_walk_stop() to see if the
> bucket table is empty and dead.
>
> Signed-off-by: NeilBrown <neilb@xxxxxxxx>
> ---
> include/linux/rhashtable.h | 13 -----------
> lib/rhashtable.c | 50 +++++++++++++-------------------------------
> 2 files changed, 15 insertions(+), 48 deletions(-)
Thanks! This looks very nice.
> @@ -580,36 +583,14 @@ static void *rhashtable_try_insert(struct rhashtable *ht, const void *key,
> struct bucket_table *new_tbl;
> struct bucket_table *tbl;
> unsigned int hash;
> - spinlock_t *lock;
> void *data;
>
> - tbl = rcu_dereference(ht->tbl);
> -
> - /* All insertions must grab the oldest table containing
> - * the hashed bucket that is yet to be rehashed.
> - */
> - for (;;) {
> - hash = rht_head_hashfn(ht, tbl, obj, ht->p);
> - lock = rht_bucket_lock(tbl, hash);
> - spin_lock_bh(lock);
> -
> - if (tbl->rehash <= hash)
> - break;
> -
> - spin_unlock_bh(lock);
> - tbl = rht_dereference_rcu(tbl->future_tbl, ht);
> - }
> -
> - data = rhashtable_lookup_one(ht, tbl, hash, key, obj);
> - new_tbl = rhashtable_insert_one(ht, tbl, hash, obj, data);
> - if (PTR_ERR(new_tbl) != -EEXIST)
> - data = ERR_CAST(new_tbl);
> + new_tbl = rcu_dereference(ht->tbl);
>
> - while (!IS_ERR_OR_NULL(new_tbl)) {
> + do {
> tbl = new_tbl;
> hash = rht_head_hashfn(ht, tbl, obj, ht->p);
> - spin_lock_nested(rht_bucket_lock(tbl, hash),
> - SINGLE_DEPTH_NESTING);
> + spin_lock(rht_bucket_lock(tbl, hash));
One small problem. I think this needs to be spin_lock_bh.
Cheers,
--
Email: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt