Re: [PATCH] nvme: fix NULL pointer dereference

From: Tong Zhang
Date: Thu Sep 17 2020 - 23:32:25 EST


Please correct me if I am wrong.
After a bit more digging I found out that it is indeed command_id got
corrupted is causing this problem. Although the tag and command_id
range is checked like you said, the elements in rqs cannot be
guaranteed to be not NULL. thus although the range check is passed,
blk_mq_tag_to_rq() can still return NULL. It is clear that the current
sanitization is not enough and there's more implication about this --
when all rqs got populated, a corrupted command_id may silently
corrupt other data not belonging to the current command.

- Tong

On Thu, Sep 17, 2020 at 8:44 PM Tong Zhang <ztong0001@xxxxxxxxx> wrote:
>
> Hmm..Yeah.. I see your point.
> I was naivly thinking the command_id was the culprit.
>
> On Thu, Sep 17, 2020 at 1:14 PM Keith Busch <kbusch@xxxxxxxxxx> wrote:
> >
> > On Thu, Sep 17, 2020 at 12:56:59PM -0400, Tong Zhang wrote:
> > > The command_id in CQE is writable by NVMe controller, driver should
> > > check its sanity before using it.
> >
> > We already do that.