Re: [PATCH v8 1/5] mmc: Add MMC host software queue support

From: Ulf Hansson
Date: Tue Feb 11 2020 - 04:45:30 EST


On Wed, 5 Feb 2020 at 13:51, Baolin Wang <baolin.wang7@xxxxxxxxx> wrote:
>
> From: Baolin Wang <baolin.wang@xxxxxxxxxx>
>
> Now the MMC read/write stack will always wait for previous request is
> completed by mmc_blk_rw_wait(), before sending a new request to hardware,
> or queue a work to complete request, that will bring context switching
> overhead, especially for high I/O per second rates, to affect the IO
> performance.

Would you mind adding some more context about mmc_blk_rw_wait()?
In particular, I want to make it clear that mmc_blk_rw_wait() is also
used to poll the card for busy completion for I/O writes, by sending
CMD13.
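
To illustrate the polling part, here is a minimal userspace sketch (not
the kernel code) of how a CMD13 status word can be checked for busy
completion. The R1 bit positions follow the card status layout
(READY_FOR_DATA is bit 8, CURRENT_STATE is bits 12:9, PRG is state 7).
polls_until_ready() is a made-up name, and the sample array merely
stands in for successive CMD13 responses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* R1 card status fields, as returned in the CMD13 response. */
#define R1_READY_FOR_DATA   (1u << 8)
#define R1_CURRENT_STATE(x) (((x) >> 9) & 0xfu)
#define R1_STATE_PRG        7u

/* True once the card accepts data again and has left the PRG state. */
bool card_ready_for_data(uint32_t status)
{
	return (status & R1_READY_FOR_DATA) &&
	       R1_CURRENT_STATE(status) != R1_STATE_PRG;
}

/*
 * Hypothetical helper: walk a list of status words, each one standing in
 * for one CMD13 response. Returns how many polls were needed until the
 * card was ready, or -1 if it never became ready.
 */
int polls_until_ready(const uint32_t *samples, size_t n)
{
	assert(samples != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (card_ready_for_data(samples[i]))
			return (int)(i + 1);
	}
	return -1;
}
```

Each of those polls costs a command round trip, which is part of the
overhead the commit message should spell out.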

>
> Thus this patch introduces MMC software queue interface based on the
> hardware command queue engine's interfaces, which is similar with the
> hardware command queue engine's idea, that can remove the context
> switching. Moreover we set the default queue depth as 64 for software
> queue, which allows more requests to be prepared, merged and inserted
> into IO scheduler to improve performance, but we only allow 2 requests
> in flight, that is enough to let the irq handler always trigger the
> next request without a context switch, as well as avoiding a long latency.

I think it's important to clarify that to use this new interface, hsq,
the host controller/driver needs to support HW busy detection for I/O
operations.

In other words, the host driver must not complete a data transfer
request until the card has stopped signalling busy. This behaviour is
also required for "closed-ended-transmissions" with CMD23, as in this
path there is no CMD12 sent to complete the transfer, and thus no R1B
response flag to trigger the HW busy detection behaviour in the
driver.
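
A small userspace model of that rule, in case it helps with the wording.
It is only a sketch: struct xfer_state and the xfer_*() functions are
hypothetical and don't correspond to any existing host driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-request state tracked by a host driver. */
struct xfer_state {
	bool data_done;   /* controller reported end of the data transfer */
	bool card_busy;   /* card is still signalling busy */
	bool completed;   /* request has been handed back to the core */
};

/* Complete only when the data phase is over and busy has been released. */
void xfer_try_complete(struct xfer_state *x)
{
	assert(x != NULL);

	if (x->data_done && !x->card_busy)
		x->completed = true;
}

/* Data-end event: completion may still have to wait for busy to end. */
void xfer_on_data_end(struct xfer_state *x)
{
	x->data_done = true;
	xfer_try_complete(x);
}

/* Busy-end event: the card stopped signalling busy. */
void xfer_on_busy_end(struct xfer_state *x)
{
	x->card_busy = false;
	xfer_try_complete(x);
}
```

The point is that the data-end event alone is not sufficient to complete
the request, regardless of whether a CMD12 was part of the transfer.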

>
> From the fio testing data in cover letter, we can see the software
> queue can improve some performance with 4K block size, increasing
> about 16% for random read, increasing about 90% for random write,
> though no obvious improvement for sequential read and write.
>
> Moreover we can expand the software queue interface to support MMC
> packed request or packed command in future.
>
> Reviewed-by: Arnd Bergmann <arnd@xxxxxxxx>
> Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxx>
> Signed-off-by: Baolin Wang <baolin.wang7@xxxxxxxxx>
> ---

[...]

> diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
> index f6912de..7a9976f 100644
> --- a/drivers/mmc/core/mmc.c
> +++ b/drivers/mmc/core/mmc.c
> @@ -1851,15 +1851,22 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
> */
> card->reenable_cmdq = card->ext_csd.cmdq_en;
>
> - if (card->ext_csd.cmdq_en && !host->cqe_enabled) {
> + if (host->cqe_ops && !host->cqe_enabled) {
> err = host->cqe_ops->cqe_enable(host, card);
> if (err) {
> pr_err("%s: Failed to enable CQE, error %d\n",
> mmc_hostname(host), err);

This means we are going to start printing an error message for those
eMMCs that don't support command queuing, when the host supports
MMC_CAP2_CQE.

Not sure how big of a problem this is, but another option is simply to
leave the logging of the *failures* to the host driver, rather than
doing it here.

Oh well, feel free to change or leave this as is for now. We can
always change it on top, if needed.
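
For reference, the alternative could look roughly like the model below.
This is a userspace sketch with made-up types and names (struct
host_model, enable_queue(), enable_ok(), enable_fail()), not a proposed
patch: the core only decides which queue mode got enabled, and logging a
failure is left to the enable callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum queue_mode { QUEUE_NONE, QUEUE_CQE, QUEUE_HSQ };

/* Hypothetical stand-in for the relevant bits of the host and the card. */
struct host_model {
	int (*cqe_enable)(struct host_model *host); /* NULL: no cqe_ops */
	bool cqe_enabled;
	bool hsq_enabled;
	bool card_cmdq_en;  /* models card->ext_csd.cmdq_en */
};

/* Example callbacks; a real driver would log its own error on failure. */
int enable_ok(struct host_model *host)   { (void)host; return 0; }
int enable_fail(struct host_model *host) { (void)host; return -1; }

/*
 * Core-side enable path. Nothing is printed here on failure. Returns the
 * mode that was newly enabled, or QUEUE_NONE if nothing changed.
 */
enum queue_mode enable_queue(struct host_model *host)
{
	assert(host != NULL);

	if (!host->cqe_enable || host->cqe_enabled)
		return QUEUE_NONE;

	if (host->cqe_enable(host))
		return QUEUE_NONE;

	host->cqe_enabled = true;
	if (host->card_cmdq_en)
		return QUEUE_CQE;

	host->hsq_enabled = true;
	return QUEUE_HSQ;
}
```

With that split, a CQE host paired with a card lacking command queuing
would fail quietly in the core, and the driver decides whether the
failure deserves a message.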

> } else {
> host->cqe_enabled = true;
> - pr_info("%s: Command Queue Engine enabled\n",
> - mmc_hostname(host));
> +
> + if (card->ext_csd.cmdq_en) {
> + pr_info("%s: Command Queue Engine enabled\n",
> + mmc_hostname(host));
> + } else {
> + host->hsq_enabled = true;
> + pr_info("%s: Host Software Queue enabled\n",
> + mmc_hostname(host));
> + }
> }
> }

[...]

Kind regards
Uffe