Re: [PATCH] rhashtable: add likely() to __rht_ptr()

From: Menglong Dong
Date: Wed Sep 24 2025 - 09:31:20 EST


On Tue, Sep 23, 2025 at 7:31 PM NeilBrown <neilb@xxxxxxxxxxx> wrote:
>
> On Tue, 23 Sep 2025, Menglong Dong wrote:
> > On Tue, Sep 23, 2025 at 2:36 PM Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > Menglong Dong <menglong8.dong@xxxxxxxxx> wrote:
> > > > In the fast path, the value of "p" in __rht_ptr() should be valid.
> > > > Therefore, wrap it with a "likely". The performance increasing is tiny,
> > > > but it's still worth to do it.
> > > >
> > > > Signed-off-by: Menglong Dong <dongml2@xxxxxxxxxxxxxxx>
> > > > ---
> > > > include/linux/rhashtable.h | 5 +++--
> > > > 1 file changed, 3 insertions(+), 2 deletions(-)
> > >
> > > It's not obvious that rht_ptr would be non-NULL. It depends on the
> > > work load. For example, if you're doing a lookup where most keys
> > > are non-existent then it would most likely be NULL.
> >
> > Yeah, I see. In my case, the usage of the rhashtable will be:
> > add -> lookup, and rht_ptr is always non-NULL. You are right,
> > it can be NULL in other situations, and it's not a good idea to
> > use likely() here ;)
>
> Have you measured a performance increase? How tiny is it?
>
> It might conceivably make sense to have a rhashtable_lookup_likely() and
> rhashtable_lookup_unlikely(), but concrete evidence of the benefit would
> be needed.

I ran a more accurate benchmark: calling rhashtable_lookup()
100000000 times.

Without the likely(), it took 123697645ns. With the likely(), it took only
84507668ns.
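
For scale, the two totals above can be turned into a per-lookup cost and a relative reduction. This is only a small user-space sketch; the helper names are made up for illustration and are not part of the kernel or the patch:

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not kernel code. */

/* Average cost of one lookup, in nanoseconds. */
static double ns_per_lookup(uint64_t total_ns, uint64_t iterations)
{
	return (double)total_ns / (double)iterations;
}

/* Time saved by the patched run, as a fraction of the baseline. */
static double relative_reduction(uint64_t baseline_ns, uint64_t patched_ns)
{
	return (double)(baseline_ns - patched_ns) / (double)baseline_ns;
}
```

With the figures above, ns_per_lookup(123697645, 100000000) is about 1.24 and ns_per_lookup(84507668, 100000000) is about 0.85, while relative_reduction(123697645, 84507668) is about 0.32. That is roughly a 32% drop, for a workload where every lookup hits.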

I added likely() not only to __rht_ptr(), but also to rht_for_each_rcu_from()
and rhashtable_lookup().
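
For readers less familiar with the annotation: with branch profiling disabled, the kernel's likely()/unlikely() boil down to the GCC/Clang builtin __builtin_expect(), which affects how the compiler lays out the branches but not the result. Here is a user-space sketch. Note that first_or_marker() and its arguments are hypothetical stand-ins and not the real __rht_ptr():

```c
#include <stddef.h>

/* Same shape as the kernel macros when branch profiling is off. */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

/*
 * Hypothetical stand-in for a bucket-head helper: return the pointer if it
 * is set, otherwise fall back to a marker value. The hint asks the compiler
 * to keep the non-NULL case on the straight-line path.
 */
static const void *first_or_marker(const void *p, const void *marker)
{
	if (likely(p != NULL))
		return p;
	return marker;
}
```

The hint does not change what the function returns. It can only help if the non-NULL case really is the common one, which depends on the workload.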

Below is part of the test code:

	for (i = 0; i < num_elems; i++) {
		objs[i] = kmalloc(sizeof(**objs), GFP_KERNEL);
		KUNIT_ASSERT_NOT_ERR_OR_NULL(test, objs[i]);
		objs[i]->key = i;
		INIT_RHT_NULLS_HEAD(objs[i]->node.next);
		ret = rhashtable_insert_fast(&ht, &objs[i]->node, bench_params);
		KUNIT_ASSERT_EQ(test, ret, 0);
	}

	/* for CPU warm up */
	for (i = 0; i < 1000000000; i++) {
		u32 key = 0;
		struct bench_obj *found;

		found = rhashtable_lookup(&ht, &key, bench_params);
		KUNIT_ASSERT_NOT_ERR_OR_NULL(test, found);
		KUNIT_ASSERT_EQ(test, found->key, key);
	}

	rcu_read_lock();
	t0 = ktime_get();
	for (i = 0; i < 100000000; i++) {
		u32 key = 0;
		struct bench_obj *found;

		found = rhashtable_lookup(&ht, &key, bench_params);
		if (unlikely(!found)) {
			pr_info("error!\n");
			break;
		}
	}
	t1 = ktime_get();
	rcu_read_unlock();

>
> Thanks,
> NeilBrown