Re: [PATCH v5] nvme: Skip trace complete_rq on host path error

Next message: Xin Long: "Re: [PATCH net-next v5] net: pppoe: implement GRO/GSO support"
Previous message: Felix Maurer: "Re: [PATCH net-next v1 2/4] net: hsr: fix VLAN add unwind on slave errors"
In reply to: hch@xxxxxx: "Re: [PATCH v5] nvme: Skip trace complete_rq on host path error"
Next in thread: &#xC804;&#xBBFC;&#xC2DD;: "Re: [PATCH v5] nvme: Skip trace complete_rq on host path error"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

From: Keith Busch

Date: Thu Mar 26 2026 - 10:53:02 EST

On Thu, Mar 26, 2026 at 03:31:24PM +0100, hch@xxxxxx wrote:
> On Thu, Mar 26, 2026 at 08:28:46AM -0600, Keith Busch wrote:
> > On Thu, Mar 26, 2026 at 03:51:52PM +0900, 전민식 wrote:
> > > {
> > > struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
> > >
> > > - trace_nvme_complete_rq(req);
> > > + /*
> > > + * The idea for these trace events was to match up commands
> > > + * dispatched to hardware with the hardware's posted response.
> > > + * So skip tracing for undispatched commands.
> > > + */
> > > + if (nvme_req(req)->status != NVME_SC_HOST_PATH_ERROR)
> > > + trace_nvme_complete_rq(req);
> > > +
> >
> > Well, how do we know a controller doesnn't actually return that status
> > code? I was just suggesting to skip the trace for the condition we never
> > dispatched the command. An added bonus is we don't need a mostly
> > unnecessary 'if' check on every IO.
>
> The 7?h error values were added for host use. The description of
> the section in the spec suggests this, but isn't actually as clear
> as I would like it. I wonder if we need to verify that controller
> don't incorrectly return it, as that could cause some problems?

Yeah, the spec is not very clear on their use. It just defines the codes
and a one sentence description, and then never mentioned again. I think
it allows the possibility for the controller to internally complete a
command with such status from some undefined OOB mechanism. I'm not sure
why the host needs to have their own self-generated status codes anyway
since it can already attach whatever arbitrary flags it wants to its
private data, like how we use NVME_REQ_CANCELLED.

Anyway if a controller did return it for whatever reason, even if it is
a bug, we'd want to see it in the trace.