Re: [PATCH RFC 3/3] nvme: delay failover by command quiesce timeout

From: Sagi Grimberg
Date: Tue Apr 15 2025 - 19:35:13 EST



What I meant was that the user can no longer set kato to be arbitrarily
long now that we introduce a failover dependency on it.

We need to set a sane maximum value that will fail over in a reasonable
timeframe. In other words, the user cannot be allowed to set kato to 60
minutes. While we didn't care about it before, now it means that failover
may take 60+ minutes.

Hence my request to cap kato at an absolute maximum value in seconds. My
vote was 10 (2x the default), but we can also go with 30.

Adding a maximum value for KATO makes a lot of sense to me. This will
help keep us away from a hung task timeout when the full delay is
taken into account. 30 makes sense to me from the perspective that
the maximum should be long enough to handle non-ideal situations
functionally, but not a value that you expect people to use regularly.
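
To make the ceiling concrete, here is a minimal sketch of the clamp being
discussed. It is not code from the patch: EXAMPLE_MAX_KATO_SECS and
example_clamp_kato() are hypothetical names, and 30 is just one of the two
values floated above (10 being the other).

```c
/* Hypothetical ceiling in seconds; the thread proposes either 10 or 30. */
#define EXAMPLE_MAX_KATO_SECS 30U

/*
 * Clamp a user-requested keep-alive timeout (in seconds) to the ceiling.
 * Values at or below the ceiling, including 0, pass through unchanged.
 */
static unsigned int example_clamp_kato(unsigned int requested_secs)
{
	if (requested_secs > EXAMPLE_MAX_KATO_SECS)
		return EXAMPLE_MAX_KATO_SECS;
	return requested_secs;
}
```

With this, a request for 3600 seconds (the 60 minute case above) would be
reduced to 30, while the default of 5 would be left alone.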

I think CQT should have a maximum allowed value for similar reasons.
If we do clamp down on the CQT, we could be opening ourselves to the
target not completely cleaning up, but it keeps us from a hung task
timeout, and _any_ delay will help most of the time.

CQT comes from the controller, and if it is high, it effectively means that
the controller cannot handle faster failover reliably. So I think we should
leave it as is. It is the vendor's problem.


CCR will not address arbitrarily long times for either because:
1. It is optional.
2. It may fail.
3. We still need a ceiling on the recovery time we can handle.

Yes, makes sense. If it fails, we need to wait until something expires,
which would be CQT.
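
A rough sketch of that fallback, purely for illustration.
example_failover_wait_ms() and its parameters are hypothetical, the CQT value
is assumed to be in milliseconds, and the "no wait when CCR succeeds" branch
is my assumption rather than something the patch states.

```c
#include <stdbool.h>

/*
 * Return how long to hold off failover, in milliseconds.
 * Assumption: a successful CCR means there is nothing left to wait for.
 * If CCR is unsupported (it is optional) or it failed, fall back to
 * waiting out the controller-reported CQT.
 */
static unsigned int example_failover_wait_ms(bool ccr_supported,
					     bool ccr_succeeded,
					     unsigned int cqt_ms)
{
	if (ccr_supported && ccr_succeeded)
		return 0;
	return cqt_ms;
}
```

Note that an unclamped CQT flows straight through here, so the recovery time
ceiling in point 3 still depends on what the controller reports.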