Re: [PATCH] nvme: reject completions for requests that are not in flight

From: Christoph Hellwig

Date: Wed May 27 2026 - 10:22:32 EST


On Fri, May 22, 2026 at 11:30:34AM -0400, Chao Shi wrote:
> nvme_find_rq() resolves a device-supplied command id to a request with
> blk_mq_tag_to_rq(), which returns whatever request last used that tag -
> possibly one that is no longer in flight (freed, or never dispatched and
> thus with a NULL rq->mq_hctx). Commit e7006de6c238 ("nvme: code
> command_id with a genctr for use-after-free validation") guards against
> this, but its generation counter is only 4 bits wide and can be matched
> by a malfunctioning or malicious device replaying command ids. The
> driver then completes a request that is not outstanding, dereferencing a
> NULL rq->mq_hctx or double-completing a command:

I don't think an intentionally malicious device is part of the threat
model here. The genctr was added to protect against buggy devices.

> +	/*
> +	 * blk_mq_tag_to_rq() returns whatever request last used this tag, which
> +	 * may no longer be in flight if the device reports a bogus command id.
> +	 * Completing it would deref a NULL rq->mq_hctx or double-complete a
> +	 * command; the 4-bit genctr below only narrows the window.
> +	 */
> +	if (unlikely(blk_mq_rq_state(rq) != MQ_RQ_IN_FLIGHT)) {
> +		dev_err(nvme_req(rq)->ctrl->device,
> +			"completion for request %#x not in flight\n", tag);
> +		return NULL;
> +	}

That said, this check looks cheap enough that it should not hurt to add
it. So I think this should be ok, but maybe respin with your planned
commit message update.
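
For illustration only, here is a rough userspace sketch of why a 4-bit
generation counter narrows the window rather than closing it, and how an
in-flight state check covers the rest. This is not the driver code: every
sketch_* name is hypothetical, and placing the generation in the top 4 bits
of a 16-bit command id with the tag in the low 12 bits is an assumption
made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for this sketch: cid = [4-bit generation | 12-bit tag]. */
#define SKETCH_GENCTR_MASK	0xfu
#define SKETCH_TAG_MASK		0xfffu
#define SKETCH_TAG_BITS		12

/* Hypothetical stand-in for the per-request driver state. */
struct sketch_req {
	uint8_t genctr;		/* bumped on every dispatch, wraps at 16 */
	bool in_flight;		/* stand-in for the MQ_RQ_IN_FLIGHT state */
};

static inline uint16_t sketch_cid(uint8_t genctr, uint16_t tag)
{
	return (uint16_t)(((genctr & SKETCH_GENCTR_MASK) << SKETCH_TAG_BITS) |
			  (tag & SKETCH_TAG_MASK));
}

static inline uint8_t sketch_genctr_from_cid(uint16_t cid)
{
	return (uint8_t)((cid >> SKETCH_TAG_BITS) & SKETCH_GENCTR_MASK);
}

static inline uint16_t sketch_tag_from_cid(uint16_t cid)
{
	return (uint16_t)(cid & SKETCH_TAG_MASK);
}

/* Dispatch: advance the generation and mark the request in flight. */
static inline uint16_t sketch_dispatch(struct sketch_req *rq, uint16_t tag)
{
	rq->genctr = (uint8_t)((rq->genctr + 1) & SKETCH_GENCTR_MASK);
	rq->in_flight = true;
	return sketch_cid(rq->genctr, tag);
}

static inline void sketch_complete(struct sketch_req *rq)
{
	rq->in_flight = false;
}

/* Generation check alone: does the cid carry the current generation? */
static inline bool sketch_genctr_ok(const struct sketch_req *rq, uint16_t cid)
{
	return sketch_genctr_from_cid(cid) == rq->genctr;
}

/* Generation check plus the in-flight state check. */
static inline bool sketch_accept(const struct sketch_req *rq, uint16_t cid)
{
	return rq->in_flight && sketch_genctr_ok(rq, cid);
}

/*
 * Capture the cid of a first dispatch, complete it, reuse the same tag
 * `reuses` more times (each reuse also completes), then replay the stale
 * cid. Returns whether the generation check alone would let it through.
 */
static inline bool sketch_replay_passes_genctr(unsigned int reuses)
{
	struct sketch_req rq = { 0, false };
	uint16_t stale = sketch_dispatch(&rq, 5);
	unsigned int i;

	sketch_complete(&rq);
	for (i = 0; i < reuses; i++) {
		sketch_dispatch(&rq, 5);
		sketch_complete(&rq);
	}
	return sketch_genctr_ok(&rq, stale);
}

/* Same scenario, but with the in-flight check applied as well. */
static inline bool sketch_replay_accepted(unsigned int reuses)
{
	struct sketch_req rq = { 0, false };
	uint16_t stale = sketch_dispatch(&rq, 5);
	unsigned int i;

	sketch_complete(&rq);
	for (i = 0; i < reuses; i++) {
		sketch_dispatch(&rq, 5);
		sketch_complete(&rq);
	}
	return sketch_accept(&rq, stale);
}
```

In this sketch a stale command id is rejected by the generation check after
one reuse of the tag, but matches again once the tag has been reused 16
times because the counter has wrapped. The state check rejects it in both
cases, since the request is no longer in flight.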