[RFC PATCH 1/1] nvme-pci: detect I/O queue depth changes after reset

From: guzebing

Date: Wed May 27 2026 - 03:54:06 EST


From: Guzebing <guzebing@xxxxxxxxxxxxx>

Firmware activation may change the controller queue depth reported
through CAP.MQES. In the nvme-pci reset path, nvme_pci_enable()
rereads CAP and updates dev->q_depth, while existing struct nvme_queue
entries keep the old q_depth and SQ/CQ DMA addresses.

If the new depth is smaller than the depth of the existing nvmeq entries,
reset recovery tries to recreate the I/O queues with a depth the
controller no longer accepts. Detect this before recreating the I/O
queues and fail the reset with an explicit error. Without the check, the
failure only shows up later as lost I/O queues and namespace removal.

If the new depth is larger, warn and continue with the existing queue
resources. The larger depth will not be used until the controller is
removed and probed again.

Signed-off-by: Guzebing <guzebing@xxxxxxxxxxxxx>
---
drivers/nvme/host/pci.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
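
Note (not part of the commit message): the comparison described above
can be modelled in userspace with the minimal sketch below. It assumes
that the usable depth is min(CAP.MQES + 1, host limit), with MQES being
the zero-based field in bits 15:0 of CAP. The names CAP_MQES,
depth_from_cap, classify_depth_change and depth_sketch_selftest are
hypothetical and exist only for illustration; they are not driver
symbols.

```c
#include <assert.h>
#include <stdint.h>

/* CAP.MQES: bits 15:0 of the CAP register, zero-based entry count. */
#define CAP_MQES(cap) ((uint32_t)((cap) & 0xffff))

enum depth_change { DEPTH_SAME, DEPTH_DECREASED, DEPTH_INCREASED };

/*
 * Hypothetical helper: depth the host would use for a given CAP value.
 * Assumption: the host clamps the controller maximum (MQES + 1) to its
 * own limit.
 */
static uint32_t depth_from_cap(uint64_t cap, uint32_t host_limit)
{
	uint32_t ctrl_max = CAP_MQES(cap) + 1;

	return ctrl_max < host_limit ? ctrl_max : host_limit;
}

/* Compare the depth the queues were allocated with against the new one. */
static enum depth_change classify_depth_change(uint32_t old_depth,
					       uint32_t new_depth)
{
	if (old_depth == new_depth)
		return DEPTH_SAME;
	return old_depth > new_depth ? DEPTH_DECREASED : DEPTH_INCREASED;
}

/* Walk through the three cases from the commit message. */
static void depth_sketch_selftest(void)
{
	uint32_t old_depth = depth_from_cap(0x03ff, 1024); /* MQES 1023 */
	uint32_t shrunk = depth_from_cap(0x00ff, 1024);    /* MQES 255 */
	uint32_t grown = depth_from_cap(0x07ff, 4096);     /* MQES 2047 */

	assert(classify_depth_change(old_depth, old_depth) == DEPTH_SAME);
	assert(classify_depth_change(old_depth, shrunk) == DEPTH_DECREASED);
	assert(classify_depth_change(old_depth, grown) == DEPTH_INCREASED);
}
```

In this model DEPTH_DECREASED corresponds to failing the reset with
-EIO, and DEPTH_INCREASED to the warning path that keeps the existing
queue resources.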

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index db5fc9bf66272..4bc112f8a096e 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3155,6 +3155,33 @@ static bool nvme_pci_update_nr_queues(struct nvme_dev *dev)
 	return true;
 }

+static int nvme_pci_check_reset_queue_depth(struct nvme_dev *dev)
+{
+	u32 nvmeq_q_depth;
+	u32 dev_q_depth = dev->q_depth;
+
+	if (dev->ctrl.queue_count <= 1)
+		return 0;
+
+	nvmeq_q_depth = dev->queues[1].q_depth;
+	if (nvmeq_q_depth == dev_q_depth)
+		return 0;
+
+	if (nvmeq_q_depth > dev_q_depth) {
+		dev_err(dev->ctrl.device,
+			"IO queue depth decreased after reset (%u -> %u); "
+			"live reset recovery is unsupported\n",
+			nvmeq_q_depth, dev_q_depth);
+		return -EIO;
+	}
+
+	dev_warn(dev->ctrl.device,
+		 "IO queue depth increased after reset (%u -> %u); "
+		 "remove and probe the controller again to use the new depth\n",
+		 nvmeq_q_depth, dev_q_depth);
+	return 0;
+}
+
 static int nvme_pci_enable(struct nvme_dev *dev)
 {
 	int result = -ENOMEM;
@@ -3371,6 +3398,9 @@ static void nvme_reset_work(struct work_struct *work)

 	mutex_lock(&dev->shutdown_lock);
 	result = nvme_pci_enable(dev);
+	if (result)
+		goto out_unlock;
+	result = nvme_pci_check_reset_queue_depth(dev);
 	if (result)
 		goto out_unlock;
 	nvme_unquiesce_admin_queue(&dev->ctrl);
--
2.20.1