Re: [PATCH v3 2/3] nvme: trigger reset when keep alive fails

From: Sagi Grimberg
Date: Wed Jan 08 2025 - 05:51:01 EST

On 07/01/2025 16:38, Daniel Wagner wrote:
On Tue, Dec 24, 2024 at 12:31:35PM +0200, Sagi Grimberg wrote:
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index bfd71511c85f8b1a9508c6ea062475ff51bf27fe..2a07c2c540b26c8cbe886711abaf6f0afbe6c4df 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1320,6 +1320,12 @@ static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq,
 		dev_err(ctrl->device,
 			"failed nvme_keep_alive_end_io error=%d\n",
 				status);
+		/*
+		 * The driver reports that we lost the connection,
+		 * trigger a recovery.
+		 */
+		if (status == BLK_STS_TRANSPORT)
+			nvme_reset_ctrl(ctrl);
 		return RQ_END_IO_NONE;
 	}

A lengthy explanation that results in nvme core behavior that assumes a very
specific driver behavior.
I tried to explain exactly what's going on, so we can discuss possible
solutions without talking past each other.

In the meantime, I started on a patch set for the TP4129-related changes
in the spec (KATO Corrections and Clarifications). These changes would
also depend on the KATO timeout handler triggering a reset.

I am fine with dropping this change for now and discussing it in light
of TP4129, if that is what you prefer.

Isn't the root of the problem that FC is willing to live peacefully with
a controller without any queues/connectivity to it without periodically
reconnecting?
The root problem is that the connectivity loss event gets ignored in the
CONNECTING state for the first connection attempt. Everything works fine
for the RECONNECTING state.

Maybe something like this instead? (untested)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index c4cbe3ce81f7..1f1d1d62a978 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -148,6 +148,7 @@ struct nvme_fc_rport {
 #define ASSOC_ACTIVE		0
 #define ASSOC_FAILED		1
 #define FCCTRL_TERMIO		2
+#define CONNECTIVITY_LOST	3

 struct nvme_fc_ctrl {
 	spinlock_t		lock;
@@ -785,6 +786,8 @@ nvme_fc_ctrl_connectivity_loss(struct nvme_fc_ctrl *ctrl)
 		"NVME-FC{%d}: controller connectivity lost. Awaiting "
 		"Reconnect", ctrl->cnum);

+	set_bit(CONNECTIVITY_LOST, &ctrl->flags);
+
 	switch (nvme_ctrl_state(&ctrl->ctrl)) {
 	case NVME_CTRL_NEW:
 	case NVME_CTRL_LIVE:
@@ -3071,6 +3074,8 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
 	if (nvme_fc_ctlr_active_on_rport(ctrl))
 		return -ENOTUNIQ;

+	clear_bit(CONNECTIVITY_LOST, &ctrl->flags);
+
 	dev_info(ctrl->ctrl.device,
 		"NVME-FC{%d}: create association : host wwpn 0x%016llx "
 		" rport wwpn 0x%016llx: NQN \"%s\"\n",
@@ -3174,6 +3179,11 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)

 	changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);

+	if (test_bit(CONNECTIVITY_LOST, &ctrl->flags)) {
+		ret = -EIO;
+		goto out_term_aeo_ops;
+	}
+
 	ctrl->ctrl.nr_reconnects = 0;

 	if (changed)

This looks a lot better to me.
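
For readers following along, the flag handoff in the fc.c proposal above can
be modeled in userspace roughly as follows. This is only a sketch: the
struct model_ctrl type and the model_* functions are hypothetical stand-ins
invented for illustration, not kernel code, and the error path is reduced to
clearing a "live" marker rather than reproducing out_term_aeo_ops.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct nvme_fc_ctrl. */
struct model_ctrl {
	atomic_bool connectivity_lost;	/* models the CONNECTIVITY_LOST bit */
	bool live;			/* models having reached NVME_CTRL_LIVE */
};

/* Models nvme_fc_ctrl_connectivity_loss(): record the event unconditionally. */
void model_connectivity_loss(struct model_ctrl *c)
{
	atomic_store(&c->connectivity_lost, true);
}

/*
 * Models nvme_fc_create_association(). The loss_during_setup argument is an
 * assumption of this sketch; it simulates the transport reporting a loss
 * while the association is still being built.
 */
int model_create_association(struct model_ctrl *c, bool loss_during_setup)
{
	/* Like clear_bit(): forget any loss seen before this attempt. */
	atomic_store(&c->connectivity_lost, false);

	if (loss_during_setup)
		model_connectivity_loss(c);

	/* Like the state change to NVME_CTRL_LIVE. */
	c->live = true;

	/* Like test_bit(): a loss seen during setup fails the attempt. */
	if (atomic_load(&c->connectivity_lost)) {
		c->live = false;	/* simplified error path */
		return -EIO;
	}
	return 0;
}
```

The point of the model is that a loss event arriving at any time between the
clear and the test is never dropped, regardless of which state the controller
was in when the event fired.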