Re: [PATCH v8 1/5] mmc: Add MMC host software queue support

From: Baolin Wang
Date: Tue Feb 11 2020 - 07:52:03 EST


On Tue, Feb 11, 2020 at 5:45 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
>
> On Wed, 5 Feb 2020 at 13:51, Baolin Wang <baolin.wang7@xxxxxxxxx> wrote:
> >
> > From: Baolin Wang <baolin.wang@xxxxxxxxxx>
> >
> > Currently the MMC read/write stack always waits in mmc_blk_rw_wait() for
> > the previous request to complete before sending a new request to the
> > hardware, or queues a work item to complete the request. This adds
> > context-switching overhead that hurts I/O performance, especially at
> > high I/O-per-second rates.
>
> Would you mind adding some more context about mmc_blk_rw_wait()?
> In particular, I want to make it clear that mmc_blk_rw_wait() is also
> used to poll the card for busy completion for I/O writes, by sending
> CMD13.

Sure.
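
To show what I have in mind for the extra context, here is a rough
user-space model of polling for busy completion with CMD13. It is only a
sketch under assumptions: struct fake_card, fake_send_status() and
poll_card_busy() are hypothetical names made up for illustration and are
not the code in mmc_blk_rw_wait(). Only the R1 status bit layout is taken
from include/linux/mmc/mmc.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* R1 status bits, matching include/linux/mmc/mmc.h */
#define R1_READY_FOR_DATA	(1u << 8)
#define R1_CURRENT_STATE(x)	(((x) & 0x00001E00u) >> 9)
#define R1_STATE_TRAN		4
#define R1_STATE_PRG		7

/* Hypothetical card model: reports PRG (busy) for a number of polls. */
struct fake_card {
	int busy_polls_left;
	int cmd13_sent;
};

/* Hypothetical stand-in for sending CMD13 (SEND_STATUS) to the card. */
static uint32_t fake_send_status(struct fake_card *card)
{
	card->cmd13_sent++;
	if (card->busy_polls_left > 0) {
		card->busy_polls_left--;
		/* Still programming: PRG state, READY_FOR_DATA clear. */
		return (uint32_t)R1_STATE_PRG << 9;
	}
	/* Write finished: back in TRAN state and ready for data. */
	return R1_READY_FOR_DATA | ((uint32_t)R1_STATE_TRAN << 9);
}

/* Poll until the card is ready and has left PRG, or give up. */
static bool poll_card_busy(struct fake_card *card, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		uint32_t status = fake_send_status(card);

		if ((status & R1_READY_FOR_DATA) &&
		    R1_CURRENT_STATE(status) != R1_STATE_PRG)
			return true;
	}
	return false;
}
```

A host with HW busy detection would not need this loop for writes, which
is why the distinction matters for the software queue.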

>
> >
> > Thus this patch introduces an MMC software queue interface based on the
> > hardware command queue engine's interfaces. It follows the same idea as
> > the hardware command queue engine and can remove the context
> > switching. Moreover, we set the default queue depth to 64 for the software
> > queue, which allows more requests to be prepared, merged and inserted
> > into the IO scheduler to improve performance, but we only allow 2 requests
> > in flight. That is enough to let the irq handler always trigger the
> > next request without a context switch, while also avoiding long latency.
>
> I think it's important to clarify that to use this new interface, hsq,
> the host controller/driver needs to support HW busy detection for I/O
> operations.
>
> In other words, the host driver must not complete a data transfer
> request until after the card stops signalling busy. This behaviour is
> also required for "closed-ended-transmissions" with CMD23, as in this
> path there is no CMD12 sent to complete the transfer, thus no R1B
> response flag to trigger the HW busy detection behaviour in the
> driver.

Sure.
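
For reference, the queue depth and in-flight limit from the commit message
can be modelled roughly as below. This is a simplified user-space sketch,
not the structures from the patch: struct hsq_model and the hsq_*()
helpers are hypothetical names, and counting in-flight requests against
the 64 slots is an assumption made for the model.

```c
#include <stdbool.h>

#define HSQ_NUM_SLOTS		64	/* default queue depth from the commit message */
#define HSQ_MAX_INFLIGHT	2	/* at most 2 requests handed to the host */

/* Hypothetical, simplified model of the software queue state. */
struct hsq_model {
	int queued;		/* requests waiting in the software queue */
	int inflight;		/* requests already issued to the host */
	int issued_total;	/* total number of requests issued so far */
};

/* Issue queued requests for as long as the in-flight limit allows. */
static void hsq_pump(struct hsq_model *q)
{
	while (q->queued > 0 && q->inflight < HSQ_MAX_INFLIGHT) {
		q->queued--;
		q->inflight++;
		q->issued_total++;
	}
}

/* Queue one request; fails when every slot is already taken. */
static bool hsq_enqueue(struct hsq_model *q)
{
	if (q->queued + q->inflight >= HSQ_NUM_SLOTS)
		return false;
	q->queued++;
	hsq_pump(q);
	return true;
}

/* Models the irq handler completing one request. */
static void hsq_complete_one(struct hsq_model *q)
{
	if (q->inflight == 0)
		return;
	q->inflight--;
	/* The next request is triggered right here, without a context switch. */
	hsq_pump(q);
}
```

With the host required to do HW busy detection, completing a request in
the irq handler is safe, because the card is no longer busy at that point.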

>
> >
> > From the fio testing data in the cover letter, we can see that the
> > software queue improves performance with a 4K block size: about 16%
> > for random read and about 90% for random write, though there is no
> > obvious improvement for sequential read and write.
> >
> > Moreover, we can expand the software queue interface to support MMC
> > packed requests or packed commands in the future.
> >
> > Reviewed-by: Arnd Bergmann <arnd@xxxxxxxx>
> > Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxx>
> > Signed-off-by: Baolin Wang <baolin.wang7@xxxxxxxxx>
> > ---
>
> [...]
>
> > diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
> > index f6912de..7a9976f 100644
> > --- a/drivers/mmc/core/mmc.c
> > +++ b/drivers/mmc/core/mmc.c
> > @@ -1851,15 +1851,22 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
> >  	 */
> >  	card->reenable_cmdq = card->ext_csd.cmdq_en;
> >
> > -	if (card->ext_csd.cmdq_en && !host->cqe_enabled) {
> > +	if (host->cqe_ops && !host->cqe_enabled) {
> >  		err = host->cqe_ops->cqe_enable(host, card);
> >  		if (err) {
> >  			pr_err("%s: Failed to enable CQE, error %d\n",
> >  				mmc_hostname(host), err);
>
> This means we are going to start printing an error message for those
> eMMCs that don't support command queuing when the host supports
> MMC_CAP2_CQE.
>
> Not sure how big of a problem this is, but another option is simply to
> leave the logging of the *failures* to the host driver, rather than
> doing it here.
>
> Oh well, feel free to change or leave this as is for now. We can
> always change it on top, if needed.

OK. I will move the failure log into the cqe_enable() callback so that the
logging behaviour stays the same as before. Thanks.
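
Roughly, the idea would look like the user-space sketch below. Everything
here is a cut-down stand-in for illustration: the structures are not the
real kernel definitions, and example_cqe_enable(), core_try_enable_cqe(),
hw_fault and errors_logged are hypothetical names. The assumption is that
a hardware CQE callback treats a card without command queuing as "not
supported" and stays quiet, and only logs genuine enable failures.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, cut-down stand-ins for the kernel structures. */
struct mmc_card { bool cmdq_en; };
struct mmc_host;
struct mmc_cqe_ops {
	int (*cqe_enable)(struct mmc_host *host, struct mmc_card *card);
};
struct mmc_host {
	const char *name;
	const struct mmc_cqe_ops *cqe_ops;
	bool cqe_enabled;
	bool hw_fault;		/* hypothetical: simulates a genuine enable failure */
	int errors_logged;	/* hypothetical: counts failure messages printed */
};

/* Hypothetical host driver callback for a hardware CQE. */
static int example_cqe_enable(struct mmc_host *host, struct mmc_card *card)
{
	/* Card without command queuing: not a failure worth logging. */
	if (!card->cmdq_en)
		return -EINVAL;

	if (host->hw_fault) {
		/* The failure message now lives in the callback. */
		fprintf(stderr, "%s: Failed to enable CQE, error %d\n",
			host->name, -EIO);
		host->errors_logged++;
		return -EIO;
	}
	return 0;
}

/* Core side: no failure message is printed here any more. */
static void core_try_enable_cqe(struct mmc_host *host, struct mmc_card *card)
{
	if (host->cqe_ops && !host->cqe_enabled &&
	    !host->cqe_ops->cqe_enable(host, card))
		host->cqe_enabled = true;
}
```

That way an eMMC without command queuing on a MMC_CAP2_CQE host does not
produce a new error message.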

> >  		} else {
> >  			host->cqe_enabled = true;
> > -			pr_info("%s: Command Queue Engine enabled\n",
> > -				mmc_hostname(host));
> > +
> > +			if (card->ext_csd.cmdq_en) {
> > +				pr_info("%s: Command Queue Engine enabled\n",
> > +					mmc_hostname(host));
> > +			} else {
> > +				host->hsq_enabled = true;
> > +				pr_info("%s: Host Software Queue enabled\n",
> > +					mmc_hostname(host));
> > +			}
> >  		}
> >  	}
>
> [...]
>
> Kind regards
> Uffe