Re: [PATCH RFC 3/3] nvme: delay failover by command quiesce timeout
From: Sagi Grimberg
Date: Tue Apr 15 2025 - 17:07:29 EST
On 15/04/2025 15:11, Daniel Wagner wrote:
On Tue, Apr 15, 2025 at 01:28:15AM +0300, Sagi Grimberg wrote:
+void nvme_schedule_failover(struct nvme_ctrl *ctrl)
+{
+	unsigned long delay;
+
+	if (ctrl->cqt)
+		delay = msecs_to_jiffies(ctrl->cqt);
+	else
+		delay = ctrl->kato * HZ;
I thought that delay = m * ctrl->kato + ctrl->cqt
where m = ctrl->ctratt & NVME_CTRL_ATTR_TBKAS ? 3 : 2
no?
This was said before, but if we are going to always start waiting for kato
for failover purposes, we first need a patch that prevents kato from being
arbitrarily long.
That should be addressed with the cross controller reset (CCR).
CCR is a better solution as it is explicit, and faster.
The KATO*n + CQT is the upper limit for the target recovery. As soon as we
have CCR, the recovery delay is reduced to the time the CCR exchange takes.
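
For reference, the upper bound under discussion (m * kato + cqt, with m
depending on TBKAS) can be modelled in plain userspace C. This is only a
sketch and not the code from the patch: struct sketch_ctrl,
SKETCH_CTRATT_TBKAS and sketch_failover_delay_ms() are hypothetical names, and
the bit position used for TBKAS is an assumption. The units follow the quoted
patch, which treats kato as seconds and cqt as milliseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for NVME_CTRL_ATTR_TBKAS; bit position assumed. */
#define SKETCH_CTRATT_TBKAS (1u << 6)

/* Hypothetical subset of the controller state used by the formula. */
struct sketch_ctrl {
	uint32_t ctratt; /* controller attributes */
	uint32_t kato;   /* keep-alive timeout, in seconds */
	uint32_t cqt;    /* command quiesce time, in milliseconds */
};

/*
 * Worst-case failover delay in milliseconds:
 *   m * kato + cqt, where m is 3 with TBKAS set and 2 otherwise.
 */
static uint64_t sketch_failover_delay_ms(const struct sketch_ctrl *ctrl)
{
	uint64_t m;

	assert(ctrl);
	m = (ctrl->ctratt & SKETCH_CTRATT_TBKAS) ? 3 : 2;
	return m * ctrl->kato * 1000ULL + ctrl->cqt;
}
```

With kato = 5 and cqt = 500 this gives 10500 ms without TBKAS and 15500 ms
with it.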
What I meant was that the user can no longer be allowed to set kato
arbitrarily long now that we introduce a failover dependency on it.
We need to set a sane maximum value that will fail over in a reasonable
timeframe.
In other words, the user cannot be allowed to set kato to 60 minutes.
While we didn't care about it before, it now means that failover may take
60+ minutes.
Hence my request to cap kato at an absolute maximum number of seconds. My
vote was 10 (2x the default), but we can also go with 30.
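
A minimal sketch of such a cap, in plain userspace C. The names
SKETCH_KATO_MAX and sketch_clamp_kato() are hypothetical, and the limit of 10
seconds is just the value voted for above. Whether an out-of-range request
should be clamped silently, as here, or rejected with an error is a design
choice this sketch does not settle.

```c
#include <assert.h>

/* Hypothetical cap in seconds; 10 is the value proposed above, 30 the alternative. */
#define SKETCH_KATO_MAX 10u

/* Return the requested kato (seconds), limited to SKETCH_KATO_MAX. */
static unsigned int sketch_clamp_kato(unsigned int requested)
{
	return requested > SKETCH_KATO_MAX ? SKETCH_KATO_MAX : requested;
}

/* Usage examples: values at or below the cap pass through unchanged. */
static void sketch_kato_examples(void)
{
	assert(sketch_clamp_kato(5) == 5);
	assert(sketch_clamp_kato(60 * 60) == SKETCH_KATO_MAX); /* 60 minutes */
}
```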