Re: [PATCH] sunrpc: use kref_get_unless_zero in auth_domain_lookup

From: Chuck Lever

Date: Wed May 20 2026 - 15:49:19 EST




On Wed, May 20, 2026, at 2:10 PM, Jeff Layton wrote:
> auth_domain_put() uses kref_put_lock(), which atomically decrements the
> refcount before acquiring auth_domain_lock. This creates a window where
> an auth_domain entry is still linked on the hash list with refcount == 0.
>
> auth_domain_lookup() walks the hash under auth_domain_lock but uses plain
> kref_get() to acquire a reference. If it finds an entry in this transient
> zero-refcount state, refcount_inc() triggers a WARN and refuses to
> increment (saturating refcount_t semantics), but the function returns the
> pointer anyway. The caller then holds a dangling reference: when the
> concurrent auth_domain_put() finally acquires the lock and runs
> auth_domain_release(), the object is freed while the lookup caller still
> has a pointer to it.
>
> The sibling function auth_domain_find() already handles this correctly
> using kref_get_unless_zero(). Apply the same pattern in
> auth_domain_lookup(): treat a zero-refcount entry as absent and continue
> searching. The loop then either finds another live entry or falls through
> to insert the new domain, preserving existing semantics.
>
> Reported-by: Chris Mason <clm@xxxxxxxx>
> Assisted-by: kres:claude-opus-4-6
> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> ---
> net/sunrpc/svcauth.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/sunrpc/svcauth.c b/net/sunrpc/svcauth.c
> index 55b4d2874188..8e01f0626759 100644
> --- a/net/sunrpc/svcauth.c
> +++ b/net/sunrpc/svcauth.c
> @@ -245,8 +245,10 @@ auth_domain_lookup(char *name, struct auth_domain *new)
>  	spin_lock(&auth_domain_lock);
>
>  	hlist_for_each_entry(hp, head, hash) {
> -		if (strcmp(hp->name, name)==0) {
> -			kref_get(&hp->ref);
> +		if (strcmp(hp->name, name) == 0) {
> +			if (!kref_get_unless_zero(&hp->ref))
> +				continue;
> +
>  			spin_unlock(&auth_domain_lock);
>  			return hp;
>  		}
>
> ---
> base-commit: 508c9eaa7e0b952c4fe019880796e6207e3cd201
> change-id: 20260520-nfsd-fixes-f137572d0480
>
> Best regards,
> --
> Jeff Layton <jlayton@xxxxxxxxxx>

Forwarding sashiko review results:

> auth_domain_put() uses kref_put_lock(), which atomically decrements the
> refcount before acquiring auth_domain_lock. This creates a window where
> an auth_domain entry is still linked on the hash list with refcount == 0.

[Severity: High]
Is this description of kref_put_lock() accurate?

Looking at kref_put_lock(), it uses refcount_dec_and_lock() which acquires
the spinlock before the 1->0 transition is completed. Since
auth_domain_lookup() holds the same auth_domain_lock, is it actually possible
for it to encounter a 0-refcount entry under normal conditions?
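
For reference, the ordering in question can be modeled in userspace. This is
only a sketch, not the kernel implementation: model_dec_and_lock() and
model_demo() are hypothetical names, and a pthread mutex and a C11 atomic_int
stand in for the spinlock and refcount_t. It shows that the count can drop
without the lock only while it stays above 1, and that the 1->0 step happens
with the lock held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for refcount_dec_and_lock().
 * Returns true with the lock held if the count reached zero,
 * false (lock not held) otherwise.
 */
static bool model_dec_and_lock(atomic_int *ref, pthread_mutex_t *lock)
{
	int old = atomic_load(ref);

	/* Fast path: drop a reference locklessly, but never 1 -> 0. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(ref, &old, old - 1))
			return false;
	}

	/* Slow path: the final decrement happens only under the lock. */
	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(ref, 1) != 1) {
		pthread_mutex_unlock(lock);
		return false;
	}
	return true;
}

/* Walk a count of 2 down to 0 and check where the lock is taken. */
static int model_demo(void)
{
	atomic_int ref = 2;
	pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

	/* 2 -> 1: no lock taken, object stays live. */
	assert(!model_dec_and_lock(&ref, &lock));
	assert(atomic_load(&ref) == 1);

	/* 1 -> 0: returns with the lock held, so a reader that takes
	 * the same lock cannot run while the count is zero. */
	assert(model_dec_and_lock(&ref, &lock));
	assert(atomic_load(&ref) == 0);
	pthread_mutex_unlock(&lock);
	return 0;
}
```

In this model, a lookup that takes the same lock either runs before the final
decrement (count >= 1) or after the releaser has dropped the lock, by which
point the release path would be expected to have unlinked the entry.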

> diff --git a/net/sunrpc/svcauth.c b/net/sunrpc/svcauth.c
> index 55b4d2874188..8e01f0626759 100644
> --- a/net/sunrpc/svcauth.c
> +++ b/net/sunrpc/svcauth.c
> @@ -245,8 +245,10 @@ auth_domain_lookup(char *name, struct auth_domain *new)
>  	spin_lock(&auth_domain_lock);
>
>  	hlist_for_each_entry(hp, head, hash) {
> -		if (strcmp(hp->name, name)==0) {
> -			kref_get(&hp->ref);
> +		if (strcmp(hp->name, name) == 0) {
> +			if (!kref_get_unless_zero(&hp->ref))
> +				continue;
> +
>  			spin_unlock(&auth_domain_lock);
>  			return hp;
>  		}

[Severity: High]
If a 0-refcount entry is encountered here, wouldn't it indicate a severe
underlying refcount imbalance or use-after-free rather than a normal race?

By using kref_get_unless_zero() and continuing the search, might this silently
mask the root cause of the WARN instead of resolving it?

Furthermore, if the loop skips the 0-refcount entry and falls through to the
end of the function, won't it insert a duplicate auth_domain with the same
name into the hash list?
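
The duplicate-insertion concern can be illustrated with a small userspace
model. This is a sketch under stated assumptions, not the svcauth code:
struct model_domain, model_lookup() and model_count() are hypothetical, a
plain int stands in for struct kref, a singly linked list stands in for the
hash chain, and locking is omitted.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct auth_domain. */
struct model_domain {
	const char *name;
	int ref;			/* models the kref count */
	struct model_domain *next;	/* models the hash chain link */
};

/*
 * Models the patched loop: a name match with a zero count is skipped,
 * and if no live match is found, 'new' is linked at the head.
 */
static struct model_domain *model_lookup(struct model_domain **head,
					 const char *name,
					 struct model_domain *new)
{
	struct model_domain *hp;

	for (hp = *head; hp; hp = hp->next) {
		if (strcmp(hp->name, name) == 0) {
			if (hp->ref == 0)
				continue;	/* kref_get_unless_zero() failed */
			hp->ref++;
			return hp;
		}
	}
	if (new) {
		new->next = *head;
		*head = new;
	}
	return new;
}

/* Count entries on the chain that carry the given name. */
static int model_count(const struct model_domain *head, const char *name)
{
	int n = 0;

	for (; head; head = head->next)
		if (strcmp(head->name, name) == 0)
			n++;
	return n;
}
```

Starting from a chain holding one entry named "nfsd" with ref == 0, calling
model_lookup() with a new "nfsd" entry leaves two entries with that name on
the chain until the stale one is unlinked.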


--
Chuck Lever