Re: [PATCH] nvme: reject completions for requests that are not in flight
From: Jens Axboe
Date: Wed May 27 2026 - 11:08:28 EST

On 5/27/26 8:19 AM, Christoph Hellwig wrote:
> On Fri, May 22, 2026 at 11:30:34AM -0400, Chao Shi wrote:
>> nvme_find_rq() resolves a device-supplied command id to a request with
>> blk_mq_tag_to_rq(), which returns whatever request last used that tag -
>> possibly one that is no longer in flight (freed, or never dispatched and
>> thus with a NULL rq->mq_hctx). Commit e7006de6c238 ("nvme: code
>> command_id with a genctr for use-after-free validation") guards against
>> this, but its generation counter is only 4 bits wide and can be matched
>> by a malfunctioning or malicious device replaying command ids. The
>> driver then completes a request that is not outstanding, dereferencing a
>> NULL rq->mq_hctx or double-completing a command:
>
> I don't think an intentionally malicious device is part of the threat
> model here. This was added to protect against buggy devices.

Malicious devices are explicitly NOT part of the Linux threat model. If
this is a real device, I'd say go talk to whoever made it and get the
firmware fixed. If this is a "hardening" effort to protect against the
threat of malicious devices, then I don't think we should bother.

>> + * blk_mq_tag_to_rq() returns whatever request last used this tag, which
>> + * may no longer be in flight if the device reports a bogus command id.
>> + * Completing it would deref a NULL rq->mq_hctx or double-complete a
>> + * command; the 4-bit genctr below only narrows the window.
>> + */
>> +	if (unlikely(blk_mq_rq_state(rq) != MQ_RQ_IN_FLIGHT)) {
>> +		dev_err(nvme_req(rq)->ctrl->device,
>> +			"completion for request %#x not in flight\n", tag);
>> +		return NULL;
>> +	}
>
> Although this check looks cheap enough that it should not hurt to add
> it. So I think this should be ok, but maybe respin with your planned
> commit message update.

Only for the right reasons, imho.

--
Jens Axboe
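
For readers following the thread, here is a minimal userspace sketch of why the quoted patch says the 4-bit genctr "only narrows the window". It assumes a 16-bit command id with the generation counter in the top 4 bits and the tag in the low 12 bits. Every `model_*` / `MODEL_*` name is hypothetical and exists only for illustration. This is not the driver's code, and the state enum merely stands in for the `MQ_RQ_IN_FLIGHT` check discussed above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the block layer's request state. */
enum model_state { MODEL_IDLE, MODEL_IN_FLIGHT, MODEL_COMPLETE };

/* Hypothetical stand-in for a request that owns one tag. */
struct model_rq {
	uint8_t genctr;			/* bumped on every dispatch of this tag */
	enum model_state state;
};

#define MODEL_GEN_MASK	0xfu		/* assumed: 4-bit generation counter */
#define MODEL_TAG_MASK	0xfffu		/* assumed: 12-bit tag */

/* Pack a generation counter and a tag into a 16-bit command id. */
static uint16_t model_cid(uint8_t genctr, uint16_t tag)
{
	return (uint16_t)(((genctr & MODEL_GEN_MASK) << 12) |
			  (tag & MODEL_TAG_MASK));
}

/* Generation check alone: only the low 4 bits of the counter are compared. */
static bool model_genctr_matches(const struct model_rq *rq, uint16_t cid)
{
	return ((cid >> 12) & MODEL_GEN_MASK) == (rq->genctr & MODEL_GEN_MASK);
}

/* Generation check plus the "is this request actually in flight" check. */
static bool model_accept(const struct model_rq *rq, uint16_t cid)
{
	return model_genctr_matches(rq, cid) && rq->state == MODEL_IN_FLIGHT;
}
```

In this model a command id replayed after the tag has been reused 16 times passes `model_genctr_matches()` again, because the counter has wrapped, while `model_accept()` still rejects it as long as the request is not in flight.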