Re: [PATCH v3 1/1] nvme: multipath: Implemented new iopolicy "queue-depth"

From: Keith Busch
Date: Tue May 21 2024 - 09:06:12 EST


On Tue, May 21, 2024 at 02:18:09PM +0530, Nilay Shroff wrote:
> On 5/21/24 01:50, John Meneghini wrote:
> > @@ -140,8 +148,12 @@ void nvme_mpath_end_request(struct request *rq)
> > {
> > struct nvme_ns *ns = rq->q->queuedata;
> >
> > + if ((nvme_req(rq)->flags & NVME_MPATH_CNT_ACTIVE))
> > + atomic_dec_if_positive(&ns->ctrl->nr_active);
> > +
> > if (!(nvme_req(rq)->flags & NVME_MPATH_IO_STATS))
> > return;
> > +
> > bdev_end_io_acct(ns->head->disk->part0, req_op(rq),
> > blk_rq_bytes(rq) >> SECTOR_SHIFT,
> > nvme_req(rq)->start_time);
> > @@ -330,6 +342,40 @@ static struct nvme_ns *nvme_round_robin_path(struct nvme_ns_head *head,
> > return found;
> > }
> >
> I think you may also want to reset the nr_active counter in case an in-flight nvme request
> is cancelled. If the request is cancelled, nvme_mpath_end_request() wouldn't be invoked.
> So you may want to reset the nr_active counter from nvme_cancel_request() as below:
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index bf7615cb36ee..4fea7883ce8e 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -497,8 +497,9 @@ EXPORT_SYMBOL_GPL(nvme_host_path_error);
>
> bool nvme_cancel_request(struct request *req, void *data)
> {
> - dev_dbg_ratelimited(((struct nvme_ctrl *) data)->device,
> - "Cancelling I/O %d", req->tag);
> + struct nvme_ctrl *ctrl = (struct nvme_ctrl *)data;
> +
> + dev_dbg_ratelimited(ctrl->device, "Cancelling I/O %d", req->tag);
>
> /* don't abort one completed or idle request */
> if (blk_mq_rq_state(req) != MQ_RQ_IN_FLIGHT)
> @@ -506,6 +507,8 @@ bool nvme_cancel_request(struct request *req, void *data)
>
> nvme_req(req)->status = NVME_SC_HOST_ABORTED_CMD;
> nvme_req(req)->flags |= NVME_REQ_CANCELLED;
> + if ((nvme_req(req)->flags & NVME_MPATH_CNT_ACTIVE))
> + atomic_dec(&ctrl->nr_active);
> blk_mq_complete_request(req);
> return true;
> }

The io stats wouldn't be right if that happened. And maybe it isn't
right on a failover, but it needs to be. Would it work if
nvme_failover_req() calls nvme_end_req() instead of directly calling
blk_mq_end_request()?
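
To make the accounting concern concrete, here is a minimal user-space sketch, not kernel code. It assumes the failover path currently ends the request directly and so skips the per-path accounting. Every model_* name and MODEL_CNT_ACTIVE is hypothetical and only loosely mirrors nvme_mpath_end_request(), nvme_end_req() and NVME_MPATH_CNT_ACTIVE from the patch under discussion.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for NVME_MPATH_CNT_ACTIVE. */
#define MODEL_CNT_ACTIVE (1u << 0)

/* Hypothetical stand-ins for struct nvme_ctrl and struct request. */
struct model_ctrl {
	atomic_int nr_active;
};

struct model_req {
	struct model_ctrl *ctrl;
	unsigned int flags;
};

/* Start path: mark the request as counted and bump the counter. */
static void model_start_request(struct model_req *rq, struct model_ctrl *ctrl)
{
	rq->ctrl = ctrl;
	rq->flags |= MODEL_CNT_ACTIVE;
	atomic_fetch_add(&ctrl->nr_active, 1);
}

/* Accounting hook, playing the role of nvme_mpath_end_request(). */
static void model_mpath_end_request(struct model_req *rq)
{
	if (rq->flags & MODEL_CNT_ACTIVE)
		atomic_fetch_sub(&rq->ctrl->nr_active, 1);
}

/* Common end path, playing the role of nvme_end_req(). */
static void model_end_req(struct model_req *rq)
{
	model_mpath_end_request(rq);
	/* a real driver would end the block request here */
}

/* Assumed current behaviour: failover ends the request directly. */
static void model_failover_direct(struct model_req *rq)
{
	(void)rq; /* accounting hook is never reached */
}

/* Suggested behaviour: failover goes through the common end path. */
static void model_failover_via_end_req(struct model_req *rq)
{
	model_end_req(rq);
}

/* Run one request through a failover and return what is left in nr_active. */
static int model_leftover(bool via_end_req)
{
	struct model_ctrl ctrl;
	struct model_req rq = { 0 };

	atomic_init(&ctrl.nr_active, 0);
	model_start_request(&rq, &ctrl);
	if (via_end_req)
		model_failover_via_end_req(&rq);
	else
		model_failover_direct(&rq);
	return atomic_load(&ctrl.nr_active);
}
```

In this model, model_leftover(false) leaves one stale count behind and model_leftover(true) leaves none. If every completion path funnels through the common end function, the counter and the io stats are settled in one place, and no extra decrement is needed in the cancel path.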