Re: [PATCH v2] nfsd: hold a lighter-weight client reference over CB_RECALL_ANY
From: Cedric Blancher
Date: Sat Apr 06 2024 - 02:07:49 EST
On Fri, 5 Apr 2024 at 20:07, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>
> On Fri, Apr 05, 2024 at 01:56:18PM -0400, Jeff Layton wrote:
> > Currently the CB_RECALL_ANY job takes a cl_rpc_users reference to the
> > client. While a callback job is technically an RPC, that counter is
> > really more for client-driven RPCs, and this has the effect of
> > preventing the client from being unhashed until the callback completes.
> >
> > If nfsd decides to send a CB_RECALL_ANY just as the client reboots, we
> > can end up in a situation where the callback can't complete on the (now
> > dead) callback channel, but the new client can't connect because the old
> > client can't be unhashed. This usually manifests as an NFS4ERR_DELAY
> > return on the CREATE_SESSION operation.
> >
> > The job is only holding a reference to the client so it can clear a flag
> > after the RPC completes. Fix this by having CB_RECALL_ANY instead
> > hold a reference to the cl_nfsdfs.cl_ref. Typically we only take that
> > sort of reference when dealing with the nfsdfs info files, but it should
> > work appropriately here to ensure that the nfs4_client doesn't
> > disappear.
> >
> > Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition")
> > Reported-by: Vladimir Benes <vbenes@xxxxxxxxxx>
> > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
>
> Applied to nfsd-fixes while waiting for review and testing. Thanks!
Please add this to the 6.6 LTS branch, too.
Ced
--
Cedric Blancher <cedric.blancher@xxxxxxxxx>
[https://plus.google.com/u/0/+CedricBlancher/]
Institute Pasteur