Re: [PATCH v3 18/21] nvme: Update CCR completion wait timeout to consider CQT
From: Hannes Reinecke
Date: Mon Mar 02 2026 - 02:35:46 EST
On 2/27/26 04:05, Randy Jennings wrote:
> On Thu, Feb 19, 2026 at 11:25 PM Hannes Reinecke <hare@xxxxxxx> wrote:
>> On 2/20/26 03:01, Randy Jennings wrote:
>>> Hannes,
>>>
>>> (ctrl->kato * 1000) + ctrl->cqt
>>>
>>> As Mohamed pointed out, we have already received a response from a CCR
>>> command. The CCR, once accepted, communicates the death of the
>>> connection to the impacted controller and starts the cleanup tracked
>>> by CQT. So, no need to wait for the impacted controller to figure out
>>> the connection is down.
>>>
>>> The max(cqt, kato) was just to give some wait time that should allow
>>> issuing a CCR again from a different controller (in case of losing
>>> communication with this one). It certainly does not need to be longer
>>> than cqt (and it should be no longer than the remaining duration of
>>> time-based retry; that should get addressed at some point). I cannot
>>> remember why kato (if larger; I expect it would be smaller) made sense
>>> at the time.
>>
>> Because we have to wait for the AEN, at which point KATO comes into
>> play yet again.
>> So max(CQT, KATO) is the appropriate waiting time for that.
>
> I see your point. It could take ~KATO time for the AEN to show up after
> the CCR operation finishes. Technically true. However, if responses
> are taking KATO time to get back to the host, I think I would rather retry
> on a more healthy link.

Sure. But currently we don't have a policy for this; for us the
AEN is just a normal completion, for which we have to wait until
the KATO interval is exhausted.
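
To make the candidates in this thread concrete, here is a minimal sketch of the two wait-time formulas being compared, plus the clamp to the remaining time-based retry window mentioned in the quoted text. It is not the patch's code: the struct and function names are hypothetical, and the units (kato in seconds, cqt in milliseconds) are an assumption inferred from the (ctrl->kato * 1000) + ctrl->cqt expression.

```c
/*
 * Hypothetical stand-in for the two fields discussed above;
 * this is NOT the kernel's struct nvme_ctrl.
 */
struct ccr_timing {
	unsigned int kato_s;	/* keep-alive timeout, seconds (assumed unit) */
	unsigned int cqt_ms;	/* command quiesce time, milliseconds (assumed unit) */
};

/* Additive form: (kato * 1000) + cqt, in milliseconds. */
static unsigned long ccr_wait_sum_ms(const struct ccr_timing *t)
{
	return (unsigned long)t->kato_s * 1000UL + t->cqt_ms;
}

/* max(CQT, KATO) form, in milliseconds. */
static unsigned long ccr_wait_max_ms(const struct ccr_timing *t)
{
	unsigned long kato_ms = (unsigned long)t->kato_s * 1000UL;

	return kato_ms > t->cqt_ms ? kato_ms : t->cqt_ms;
}

/* Never wait longer than what is left of the time-based retry window. */
static unsigned long ccr_wait_clamped_ms(unsigned long wait_ms,
					 unsigned long retry_left_ms)
{
	return wait_ms < retry_left_ms ? wait_ms : retry_left_ms;
}
```

With kato = 5 s and cqt = 3000 ms, the additive form waits 8000 ms while the max() form waits 5000 ms; either result could then be passed through the clamp.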
We really should have a session or BOF about CCR handling at LSF.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@xxxxxxx +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich