[PATCH v2] nfsd: hold a lighter-weight client reference over CB_RECALL_ANY
From: Jeff Layton
Date: Fri Apr 05 2024 - 13:56:43 EST
Currently the CB_RECALL_ANY job takes a cl_rpc_users reference to the
client. While a callback job is technically an RPC that counter is
really more for client-driven RPCs, and this has the effect of
preventing the client from being unhashed until the callback completes.
If nfsd decides to send a CB_RECALL_ANY just as the client reboots, we
can end up in a situation where the callback can't complete on the (now
dead) callback channel, but the new client can't connect because the old
client can't be unhashed. This usually manifests as a NFS4ERR_DELAY
return on the CREATE_SESSION operation.
The job is only holding a reference to the client so it can clear a flag
in the after the RPC completes. Fix this by having CB_RECALL_ANY instead
hold a reference to the cl_nfsdfs.cl_ref. Typically we only take that
sort of reference when dealing with the nfsdfs info files, but it should
work appropriately here to ensure that the nfs4_client doesn't
disappear.
Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition")
Reported-by: Vladimir Benes <vbenes@xxxxxxxxxx>
Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
---
Changes in v2:
- Clean up the changelog
- Add Fixes: tag
- Use kref_get instead of kref_get_unless_zero
---
fs/nfsd/nfs4state.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 5fcd93f7cb8c..3cef81e196c6 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3042,12 +3042,9 @@ static void
nfsd4_cb_recall_any_release(struct nfsd4_callback *cb)
{
struct nfs4_client *clp = cb->cb_clp;
- struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
- spin_lock(&nn->client_lock);
clear_bit(NFSD4_CLIENT_CB_RECALL_ANY, &clp->cl_flags);
- put_client_renew_locked(clp);
- spin_unlock(&nn->client_lock);
+ drop_client(clp);
}
static int
@@ -6616,7 +6613,7 @@ deleg_reaper(struct nfsd_net *nn)
list_add(&clp->cl_ra_cblist, &cblist);
/* release in nfsd4_cb_recall_any_release */
- atomic_inc(&clp->cl_rpc_users);
+ kref_get(&clp->cl_nfsdfs.cl_ref);
set_bit(NFSD4_CLIENT_CB_RECALL_ANY, &clp->cl_flags);
clp->cl_ra_time = ktime_get_boottime_seconds();
}
---
base-commit: 05258a0a69b3c5d2c003f818702c0a52b6fea861
change-id: 20240405-rhel-31513-028ab6f14252
Best regards,
--
Jeff Layton <jlayton@xxxxxxxxxx>