[PATCH 2/2] sunrpc: don't immediately retransmit on seqno miss
From: Nikhil Jha
Date: Mon Mar 10 2025 - 11:12:22 EST
RFC2203 requires that retransmitted messages use a new gss sequence
number, but the same XID. This means that if the server is just slow
(e.x. overloaded), the client might receive a response using an older
seqno than the one it has recorded.
Currently, Linux's client immediately retransmits in this case. However,
this could lead to a lot of wasted retransmits until the server
eventually responds faster than the client.
Client -> SEQ 1 -> Server
Client -> SEQ 2 -> Server
Client <- SEQ 1 <- Server (misses, expecting seqno = 2)
Client -> SEQ 3 -> Server (immediate retransmission on miss)
Client <- SEQ 2 <- Server (misses, expecting seqno = 3)
Client -> SEQ 4 -> Server (immediate retransmission on miss)
... and so on ...
This commit makes it so that we ignore messages with bad checksums
due to seqnum mismatch, and rely on the usual timeout behavior for
retransmission instead of doing so immediately.
Signed-off-by: Nikhil Jha <njha@xxxxxxxxxxxxxx>
---
net/sunrpc/clnt.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 2fe88ea79a70..74bbd050e6d9 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -2760,8 +2760,13 @@ rpc_decode_header(struct rpc_task *task, struct xdr_stream *xdr)
case -EPROTONOSUPPORT:
goto out_err;
case -EACCES:
- /* Re-encode with a fresh cred */
- fallthrough;
+ /* probable RPCSEC_GSS out-of-sequence event (RFC2203),
+ * reset recv state and keep waiting, don't retransmit
+ */
+ task->tk_rqstp->rq_reply_bytes_recvd = 0;
+ task->tk_status = xprt_request_enqueue_receive(task);
+ task->tk_action = call_transmit_status;
+ return -EBADMSG;
default:
goto out_garbage;
}
--
2.39.3