Re: [PATCH] sunrpc: Add task's xid to 'not responding' messages on call_timeout

From: Trond Myklebust
Date: Fri Feb 09 2018 - 20:42:08 EST


On Fri, 2018-02-09 at 23:06 -0200, Thiago Rafael Becker wrote:
> When investigating reasons for nfs failures, packet dumps are
> eventually used.
> Finding the rpc that generated the failure is done by comparing all
> sent rpc calls and all received rpc replies for those which are
> unanswered, which is prone to errors like
> - Slow server responses
> - Incomplete and uncaptured packets in the packet dump
> - The heuristics used to inspect packets failing to interpret one
>
> This patch adds the xid of rpc_tasks to the 'not responding' messages
> in call_timeout to make this analysis more precise.
>
> Signed-off-by: Thiago Rafael Becker <thiago.becker@xxxxxxxxx>
> ---
> net/sunrpc/clnt.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index e2a4184f3c5d..83c8aca951f4 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -2214,9 +2214,10 @@ call_timeout(struct rpc_task *task)
> }
> if (RPC_IS_SOFT(task)) {
> if (clnt->cl_chatty) {
> - printk(KERN_NOTICE "%s: server %s not responding, timed out\n",
> + printk(KERN_NOTICE "%s: server %s not responding, timed out (xid: %x)\n",
> clnt->cl_program->name,
> - task->tk_xprt->servername);
> + task->tk_xprt->servername,
> + be32_to_cpu(task->tk_rqstp->rq_xid));
> }
> if (task->tk_flags & RPC_TASK_TIMEOUT)
> rpc_exit(task, -ETIMEDOUT);
> @@ -2228,9 +2229,10 @@ call_timeout(struct rpc_task *task)
> if (!(task->tk_flags & RPC_CALL_MAJORSEEN)) {
> task->tk_flags |= RPC_CALL_MAJORSEEN;
> if (clnt->cl_chatty) {
> - printk(KERN_NOTICE "%s: server %s not responding, still trying\n",
> + printk(KERN_NOTICE "%s: server %s not responding, still trying (xid: %x)\n",
> clnt->cl_program->name,
> - task->tk_xprt->servername);
> + task->tk_xprt->servername,
> + be32_to_cpu(task->tk_rqstp->rq_xid));
> }
> }
> rpc_force_rebind(clnt);

NACK. We should not be logging internal information such as XIDs as
KERN_NOTICE messages. If you want this information, you can extract it
yourself; there are already plenty of ways to do so as a privileged
user.

--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@xxxxxxxxxxxxxxx