[PATCH 01/10] nfsd: fix BUG_ON in nfsd4_alloc_layout_stateid on racing delegation revoke
From: Jeff Layton
Date: Thu May 28 2026 - 17:59:42 EST
nfsd4_alloc_layout_stateid reads fp->fi_deleg_file without holding
fi_lock when the parent stateid is a delegation. A concurrent delegation
revoke via the laundromat can clear fi_deleg_file under fi_lock, causing
nfsd_file_get() to return NULL and triggering the BUG_ON.
This race is client-reachable: two NFS clients can trigger it by having
one hold a delegation while another opens the same file to force a
recall. When the first client doesn't respond to the recall, the
laundromat revokes it. A concurrent LAYOUTGET from any client using the
delegation stateid hits the race window.
Fix this by taking fi_lock around the fi_deleg_file read in the
SC_TYPE_DELEG path, and replacing the BUG_ON with a graceful error
return that cleans up the partially-initialized layout stateid.
Fixes: c5c707f96fc9 ("nfsd: implement pNFS layout recalls")
Assisted-by: kres:claude-opus-4-7
Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
---
fs/nfsd/nfs4layouts.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
index 9ed2e3d65062..5d48c7673fa1 100644
--- a/fs/nfsd/nfs4layouts.c
+++ b/fs/nfsd/nfs4layouts.c
@@ -247,11 +247,17 @@ nfsd4_alloc_layout_stateid(struct nfsd4_compound_state *cstate,
nfsd4_init_cb(&ls->ls_recall, clp, &nfsd4_cb_layout_ops,
NFSPROC4_CLNT_CB_LAYOUT);
- if (parent->sc_type == SC_TYPE_DELEG)
+ if (parent->sc_type == SC_TYPE_DELEG) {
+ spin_lock(&fp->fi_lock);
ls->ls_file = nfsd_file_get(rcu_dereference_protected(fp->fi_deleg_file, 1));
- else
+ spin_unlock(&fp->fi_lock);
+ } else {
ls->ls_file = find_any_file(fp);
- BUG_ON(!ls->ls_file);
+ }
+ if (!ls->ls_file) {
+ nfs4_put_stid(stp);
+ return NULL;
+ }
ls->ls_fenced = false;
ls->ls_fence_delay = 0;
--
2.54.0