Re: [PATCH v1 1/1] nvme: complete directly for hctx with only one ctx mapping

From: Keith Busch
Date: Tue May 30 2023 - 13:45:59 EST


On Tue, May 30, 2023 at 10:41:19AM +0800, Po-Wen Kao wrote:
> ---
> block/blk-mq.c | 8 +++-----
> drivers/nvme/host/nvme.h | 4 ++++
> 2 files changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 1749f5890606..b60c78f5ad46 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1181,12 +1181,10 @@ bool blk_mq_complete_request_remote(struct request *rq)
>  	WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
>
>  	/*
> -	 * For request which hctx has only one ctx mapping,
> -	 * or a polled request, always complete locally,
> -	 * it's pointless to redirect the completion.
> +	 * For a polled request, always complete locally, it's pointless
> +	 * to redirect the completion.
>  	 */
> -	if (rq->mq_hctx->nr_ctx == 1 ||
> -	    rq->cmd_flags & REQ_POLLED)
> +	if (rq->cmd_flags & REQ_POLLED)
>  		return false;
>
>  	if (blk_mq_complete_need_ipi(rq)) {
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 7cf8e44d135e..acc9b1ce071d 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -702,6 +702,10 @@ static inline bool nvme_try_complete_req(struct request *req, __le16 status,
>  	nvme_should_fail(req);
>  	if (unlikely(blk_should_fake_timeout(req->q)))
>  		return true;
> +	if (likely(req->mq_hctx->nr_ctx == 1)) {
> +		WRITE_ONCE(req->state, MQ_RQ_COMPLETE);
> +		return false;
> +	}

I don't think we want low-level drivers directly messing with blk-mq
request state.

Is the early nr_ctx check optimisation really worth it? Would the
following work for your use case?

---
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f6dad0886a2fa..a2d65bb127e29 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1176,7 +1176,8 @@ bool blk_mq_complete_request_remote(struct request *rq)
 	 * or a polled request, always complete locally,
 	 * it's pointless to redirect the completion.
 	 */
-	if (rq->mq_hctx->nr_ctx == 1 ||
+	if ((rq->mq_hctx->nr_ctx == 1 &&
+	     rq->mq_ctx->cpu == raw_smp_processor_id()) ||
 	    rq->cmd_flags & REQ_POLLED)
 		return false;
--
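
For anyone who wants to see the difference in isolation, here is a rough
userspace sketch of the two conditions. It is not kernel code: struct
model_request and both helper names are made up for illustration, and
current_cpu stands in for raw_smp_processor_id(). The helpers return true
where blk_mq_complete_request_remote() would return false, i.e. where the
caller completes the request locally.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the fields the check reads.
 * This is NOT the kernel's struct request.
 */
struct model_request {
	unsigned int nr_ctx;	/* models rq->mq_hctx->nr_ctx */
	int submit_cpu;		/* models rq->mq_ctx->cpu */
	bool polled;		/* models rq->cmd_flags & REQ_POLLED */
};

/* Existing check: a single-ctx hctx always completes locally. */
static bool complete_locally_old(const struct model_request *rq)
{
	return rq->nr_ctx == 1 || rq->polled;
}

/*
 * Suggested check: a single-ctx hctx completes locally only when the
 * completion is already running on the CPU that submitted the request.
 */
static bool complete_locally_new(const struct model_request *rq,
				 int current_cpu)
{
	return (rq->nr_ctx == 1 && rq->submit_cpu == current_cpu) ||
	       rq->polled;
}
```

With nr_ctx == 1 and the completion arriving on a different CPU, the
second helper returns false. In the real function that case falls
through to the blk_mq_complete_need_ipi() test instead of bailing out
early. Polled requests behave the same under both versions.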