Re: [PATCH v5 3/6] nvme-tcp: short-circuit reconnect retries

From: Sagi Grimberg
Date: Tue Apr 09 2024 - 16:21:11 EST




On 09/04/2024 12:35, Daniel Wagner wrote:
From: Hannes Reinecke <hare@xxxxxxx>

Returning an nvme status from nvme_tcp_setup_ctrl() indicates that the
association was established and we have received a status from the
controller; consequently we should honour the DNR bit. When it is set, any
future reconnect attempt will just return the same error, so we can
short-circuit the reconnect attempts and fail the connection directly.

Signed-off-by: Hannes Reinecke <hare@xxxxxxx>
[dwagner: add helper to decide to reconnect]
Signed-off-by: Daniel Wagner <dwagner@xxxxxxx>
---
drivers/nvme/host/nvme.h | 24 ++++++++++++++++++++++++
drivers/nvme/host/tcp.c | 23 +++++++++++++++--------
2 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 9b8904a476b8..dfe103283a3d 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -701,6 +701,30 @@ static inline bool nvme_is_path_error(u16 status)
	return (status & 0x700) == 0x300;
}
+/*
+ * Evaluate the status information returned by the LLDD in order to
+ * decide if a reconnect attempt should be scheduled.
+ *
+ * There are two cases where no reconnect attempt should be scheduled:
+ *
+ * 1) The LLDD reports a negative status. There was an error (e.g. no
+ * memory) on the host side, so the operation is aborted.
+ * Note, there are exceptions such as ENOTCONN, which is
+ * not an internal driver error; we filter these errors
+ * out and retry later.
+ * 2) The DNR bit is set and the specification states no further
+ * connect attempts with the same set of parameters should be
+ * attempted.
+ */
+static inline bool nvme_ctrl_reconnect(int status)
+{
+	if (status < 0 && status != -ENOTCONN)
+		return false;

So if the host fails to allocate a buffer, it will never attempt
another reconnect? That doesn't sound right to me...
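
To make the concern concrete, here is a minimal userspace sketch of the decision the comment above describes. It is not the code from the patch: the quoted hunk stops after the first check, so the DNR check is inferred from the comment text only. The names sketch_ctrl_reconnect() and SKETCH_NVME_SC_DNR are hypothetical, and the bit value 0x4000 is an assumption about how the DNR flag is encoded in the status word.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed encoding of the Do Not Retry bit in the status word (hypothetical name). */
#define SKETCH_NVME_SC_DNR 0x4000

/*
 * Sketch of the reconnect decision described in the comment:
 * returns true if another reconnect attempt should be scheduled.
 */
static bool sketch_ctrl_reconnect(int status)
{
	/* Case 1: negative errno from the host side, except -ENOTCONN. */
	if (status < 0 && status != -ENOTCONN)
		return false;

	/* Case 2 (inferred from the comment): controller status with DNR set. */
	if (status > 0 && (status & SKETCH_NVME_SC_DNR))
		return false;

	/* Everything else is treated as retryable. */
	return true;
}
```

With this logic, sketch_ctrl_reconnect(-ENOMEM) returns false, which is the case questioned above: a transient host-side allocation failure ends the reconnect attempts, while -ENOTCONN is still retried.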