Re: [PATCH] sunrpc: use kref_get_unless_zero in auth_domain_lookup
From: Chuck Lever
Date: Wed May 20 2026 - 15:49:19 EST
On Wed, May 20, 2026, at 2:10 PM, Jeff Layton wrote:
> auth_domain_put() uses kref_put_lock(), which atomically decrements the
> refcount before acquiring auth_domain_lock. This creates a window where
> an auth_domain entry is still linked on the hash list with refcount == 0.
>
> auth_domain_lookup() walks the hash under auth_domain_lock but uses plain
> kref_get() to acquire a reference. If it finds an entry in this transient
> zero-refcount state, refcount_inc() triggers a WARN and refuses to
> increment (saturating refcount_t semantics), but the function returns the
> pointer anyway. The caller then holds a dangling reference: when the
> concurrent auth_domain_put() finally acquires the lock and runs
> auth_domain_release(), the object is freed while the lookup caller still
> has a pointer to it.
>
> The sibling function auth_domain_find() already handles this correctly
> using kref_get_unless_zero(). Apply the same pattern in
> auth_domain_lookup(): treat a zero-refcount entry as absent and continue
> searching. The loop then either finds another live entry or falls through
> to insert the new domain, preserving existing semantics.
>
> Reported-by: Chris Mason <clm@xxxxxxxx>
> Assisted-by: kres:claude-opus-4-6
> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> ---
> net/sunrpc/svcauth.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/sunrpc/svcauth.c b/net/sunrpc/svcauth.c
> index 55b4d2874188..8e01f0626759 100644
> --- a/net/sunrpc/svcauth.c
> +++ b/net/sunrpc/svcauth.c
> @@ -245,8 +245,10 @@ auth_domain_lookup(char *name, struct auth_domain *new)
>  	spin_lock(&auth_domain_lock);
>  
>  	hlist_for_each_entry(hp, head, hash) {
> -		if (strcmp(hp->name, name)==0) {
> -			kref_get(&hp->ref);
> +		if (strcmp(hp->name, name) == 0) {
> +			if (!kref_get_unless_zero(&hp->ref))
> +				continue;
> +
>  			spin_unlock(&auth_domain_lock);
>  			return hp;
>  		}
>
> ---
> base-commit: 508c9eaa7e0b952c4fe019880796e6207e3cd201
> change-id: 20260520-nfsd-fixes-f137572d0480
>
> Best regards,
> --
> Jeff Layton <jlayton@xxxxxxxxxx>

Forwarding sashiko review results:

> auth_domain_put() uses kref_put_lock(), which atomically decrements the
> refcount before acquiring auth_domain_lock. This creates a window where
> an auth_domain entry is still linked on the hash list with refcount == 0.

[Severity: High]

Is this description of kref_put_lock() accurate?

Looking at kref_put_lock(), it calls refcount_dec_and_lock(), which acquires
the spinlock before completing the 1->0 transition. Since
auth_domain_lookup() walks the list under the same auth_domain_lock, can it
actually encounter a zero-refcount entry under normal conditions?
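
To make the question above concrete, here is a rough userspace model of the
dec-and-lock pattern as I understand it: decrement without the lock unless the
count is 1, and otherwise take the lock before the final decrement. The names
model_ref, model_dec_not_one() and model_dec_and_lock() are made up for this
sketch, a pthread mutex stands in for the spinlock, and the saturation handling
of refcount_t is left out. It is an illustration, not the kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for refcount_t; not kernel code. */
struct model_ref {
	atomic_int count;
};

/* Drop one reference unless the count is exactly 1. */
static bool model_dec_not_one(struct model_ref *r)
{
	int old = atomic_load(&r->count);

	/* A count of zero here would already be a refcount bug. */
	assert(old > 0);

	while (old != 1) {
		/* On failure, "old" is reloaded and the test is repeated. */
		if (atomic_compare_exchange_weak(&r->count, &old, old - 1))
			return true;
	}
	return false;
}

/*
 * Returns true, with the lock held, only if this call dropped the last
 * reference. The 1->0 step happens after the lock is taken, so a reader
 * holding the same lock does not observe a linked entry at zero.
 */
static bool model_dec_and_lock(struct model_ref *r, pthread_mutex_t *lock)
{
	if (model_dec_not_one(r))
		return false;

	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(&r->count, 1) != 1) {
		/* Someone took a reference in the meantime; not the last put. */
		pthread_mutex_unlock(lock);
		return false;
	}
	return true;
}
```

In this model, the window described in the commit message would only exist if
the final decrement happened before the lock was taken.
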
> diff --git a/net/sunrpc/svcauth.c b/net/sunrpc/svcauth.c
> index 55b4d2874188..8e01f0626759 100644
> --- a/net/sunrpc/svcauth.c
> +++ b/net/sunrpc/svcauth.c
> @@ -245,8 +245,10 @@ auth_domain_lookup(char *name, struct auth_domain *new)
>  	spin_lock(&auth_domain_lock);
>  
>  	hlist_for_each_entry(hp, head, hash) {
> -		if (strcmp(hp->name, name)==0) {
> -			kref_get(&hp->ref);
> +		if (strcmp(hp->name, name) == 0) {
> +			if (!kref_get_unless_zero(&hp->ref))
> +				continue;
> +
>  			spin_unlock(&auth_domain_lock);
>  			return hp;
>  		}

[Severity: High]

If a zero-refcount entry is encountered here, wouldn't that indicate a severe
underlying refcount imbalance or use-after-free rather than a normal race?
By using kref_get_unless_zero() and continuing the search, might this silently
mask the root cause of the WARN instead of resolving it?

Furthermore, if the loop skips the zero-refcount entry and falls through to
the end of the function, won't it insert a duplicate auth_domain with the
same name into the hash list?
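
The duplicate-insertion concern can be sketched with a small single-threaded
model. Everything here is hypothetical: model_domain is a stand-in for struct
auth_domain, a singly linked list replaces the hash bucket, and a plain int
replaces the kref. It only shows what the patched loop does if a matching
entry with a zero count is still linked, without claiming that state is
reachable in practice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct auth_domain; fields are illustrative. */
struct model_domain {
	const char *name;
	int ref;			/* plain int: the model is single-threaded */
	struct model_domain *next;
};

/* Take a reference only if the count is non-zero. */
static bool model_get_unless_zero(struct model_domain *d)
{
	if (d->ref == 0)
		return false;
	d->ref++;
	return true;
}

/*
 * Models the patched loop: a matching entry with a zero count is skipped,
 * and if no live match is found, "new" is linked at the head of the list.
 */
static struct model_domain *model_lookup(struct model_domain **head,
					 const char *name,
					 struct model_domain *new)
{
	struct model_domain *hp;

	assert(head && name);

	for (hp = *head; hp; hp = hp->next) {
		if (strcmp(hp->name, name) == 0) {
			if (!model_get_unless_zero(hp))
				continue;
			return hp;
		}
	}
	if (new) {
		new->next = *head;
		*head = new;
	}
	return new;
}

/* Count how many linked entries carry the given name. */
static int model_count_name(const struct model_domain *head, const char *name)
{
	int n = 0;

	for (; head; head = head->next)
		if (strcmp(head->name, name) == 0)
			n++;
	return n;
}
```

Starting from a list that holds one "example" entry with ref == 0, calling
model_lookup() with a new "example" entry leaves two entries of that name
linked until the stale one is removed.
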
--
Chuck Lever