Re: [PATCH] nvme-pci: cancel nvme device request before disabling

From: Keith Busch
Date: Fri Aug 14 2020 - 11:42:39 EST


On Fri, Aug 14, 2020 at 11:37:20AM -0400, Tong Zhang wrote:
> On Fri, Aug 14, 2020 at 11:04 AM Keith Busch <kbusch@xxxxxxxxxx> wrote:
> >
> > On Fri, Aug 14, 2020 at 03:14:31AM -0400, Tong Zhang wrote:
> > > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > > index ba725ae47305..c4f1ce0ee1e3 100644
> > > --- a/drivers/nvme/host/pci.c
> > > +++ b/drivers/nvme/host/pci.c
> > > @@ -1249,8 +1249,8 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
> > > dev_warn_ratelimited(dev->ctrl.device,
> > > "I/O %d QID %d timeout, disable controller\n",
> > > req->tag, nvmeq->qid);
> > > - nvme_dev_disable(dev, true);
> > > nvme_req(req)->flags |= NVME_REQ_CANCELLED;
> > > + nvme_dev_disable(dev, true);
> > > return BLK_EH_DONE;
> >
> > Shouldn't this flag have been set in nvme_cancel_request()?
>
> nvme_cancel_request() is not setting this flag to cancelled and this is causing

Right, I see that it doesn't, but I'm saying that it should. We used to
do something like that, and I'm struggling to recall why we no longer
do. The driver is not reporting non-response back for all cancelled
requests, and that is probably not what we should be doing.
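
To make the ordering question concrete, here is a rough userspace model, not
kernel code. Every model_* name is made up for illustration; only
NVME_REQ_CANCELLED, nvme_timeout(), nvme_dev_disable() and
nvme_cancel_request() come from the thread. It assumes the submitter samples
the cancelled flag at the moment the request completes, which is a
simplification of whatever the real waiter does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for NVME_REQ_CANCELLED; the value is arbitrary. */
#define MODEL_REQ_CANCELLED 0x1u

enum model_result { MODEL_OK = 0, MODEL_EINTR = 1 };

struct model_req {
	unsigned int flags;
	bool completed;
	enum model_result result;	/* what the submitter observed */
};

/* Assumption: the flag is read once, when the request completes. */
static void model_complete(struct model_req *req)
{
	if (req->completed)
		return;
	req->completed = true;
	req->result = (req->flags & MODEL_REQ_CANCELLED) ?
			MODEL_EINTR : MODEL_OK;
}

/* Models nvme_cancel_request(); 'mark' is the behaviour suggested above. */
static void model_cancel_request(struct model_req *req, bool mark)
{
	if (mark)
		req->flags |= MODEL_REQ_CANCELLED;
	model_complete(req);
}

/* Models nvme_dev_disable(): every outstanding request gets cancelled. */
static void model_dev_disable(struct model_req *reqs, size_t n, bool mark)
{
	for (size_t i = 0; i < n; i++)
		model_cancel_request(&reqs[i], mark);
}

/* Ordering before the patch: disable first, then set the flag. */
static enum model_result model_timeout_flag_after(struct model_req *req)
{
	model_dev_disable(req, 1, false);
	req->flags |= MODEL_REQ_CANCELLED;	/* too late in this model */
	return req->result;
}

/* Ordering in the patch: set the flag, then disable. */
static enum model_result model_timeout_flag_before(struct model_req *req)
{
	req->flags |= MODEL_REQ_CANCELLED;
	model_dev_disable(req, 1, false);
	return req->result;
}

/* Alternative: the cancel path marks each request itself. */
static enum model_result model_timeout_flag_in_cancel(struct model_req *req)
{
	model_dev_disable(req, 1, true);
	return req->result;
}
```

Under that assumption, only the first variant completes the request without
the flag visible. The third variant differs from the patch in that every
request cancelled by the disable path gets marked, not only the one that
timed out.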