Re: [PATCH 1/3] nfsd: add more info to WARN_ON_ONCE on failed callbacks
From: Jeff Layton
Date: Mon Sep 09 2024 - 13:40:53 EST
On Mon, 2024-09-09 at 13:23 -0400, Olga Kornievskaia wrote:
> On Mon, Aug 26, 2024 at 8:54 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> >
> > Currently, you get the warning and stack trace, but nothing is printed
> > about the relevant error codes. Add that in.
> >
> > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > ---
> > fs/nfsd/nfs4callback.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index d756f443fc44..dee9477cc5b5 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -1333,7 +1333,8 @@ static void nfsd4_cb_done(struct rpc_task *task, void *calldata)
> > return;
> >
> > if (cb->cb_status) {
> > - WARN_ON_ONCE(task->tk_status);
> > + WARN_ONCE(task->tk_status, "cb_status=%d tk_status=%d",
> > + cb->cb_status, task->tk_status);
> > task->tk_status = cb->cb_status;
> > }
>
> Educational question: why is this warning there in the first place? I
> can appreciate the value of the added information. Does knfsd expect
> that a callback should never fail with an error and thus tries to always
> catch it?
> catch it? A tracepoint can log an rpc tasks status but I realize that
> it doesn't capture attention like a WARN_ON.
>
> I have a report of this warn_on firing and I'm trying to figure out
> why it is happening, but I was surprised to find that nfsd cares so
> much about the callback status.
>
This indicates that we had to reissue the RPC, and it got back a second
error. The stack trace is not terribly helpful, IMO. I personally
tripped this while working on the delstid patches, because I had some
bugs in that series initially.
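
To make the sequence concrete, here is a rough userspace model of the
situation (this is not nfsd code -- the names and error values are made
up): cb_status carries the error recorded from an earlier callback
attempt, tk_status is the result of the retried RPC, and the warning
only fires when both are non-zero, i.e. the retry also came back with
an error.

/*
 * Hypothetical model of the nfsd4_cb_done() check, not kernel code.
 */
#include <stdio.h>

struct fake_cb {
	int cb_status;	/* error recorded from a previous attempt */
	int tk_status;	/* result of the current (retried) RPC */
};

static void fake_cb_done(struct fake_cb *cb)
{
	if (cb->cb_status) {
		if (cb->tk_status)
			fprintf(stderr,
				"WARNING: cb_status=%d tk_status=%d\n",
				cb->cb_status, cb->tk_status);
		/* hand the original error back to the caller */
		cb->tk_status = cb->cb_status;
	}
}

int main(void)
{
	/* e.g. first attempt failed with -10, the retry with -107 */
	struct fake_cb cb = { .cb_status = -10, .tk_status = -107 };

	fake_cb_done(&cb);
	printf("final status handed back: %d\n", cb.tk_status);
	return 0;
}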
Chuck and I have discussed that the callback channel really needs a full
code audit (and probably a real overhaul). The code is just not as
robust as it ought to be, IMO. I've no objection to ripping this
warning out, but it does indicate that the callback "engine" is in a
situation that it may not handle well.
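
If we did rip it out, a tracepoint along the lines Olga mentions might
look roughly like the sketch below. This is not part of the patch: the
event name is invented and the usual fs/nfsd/trace.h boilerplate
(TRACE_SYSTEM, includes, header guards) is omitted.

TRACE_EVENT(nfsd_cb_done_err,
	TP_PROTO(const struct nfsd4_callback *cb, int tk_status),
	TP_ARGS(cb, tk_status),
	TP_STRUCT__entry(
		__field(int, cb_status)
		__field(int, tk_status)
	),
	TP_fast_assign(
		__entry->cb_status = cb->cb_status;
		__entry->tk_status = tk_status;
	),
	TP_printk("cb_status=%d tk_status=%d",
		__entry->cb_status, __entry->tk_status)
);

That would record both status codes without the stack trace, at the cost
of being easy to miss unless someone is actively tracing.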
--
Jeff Layton <jlayton@xxxxxxxxxx>