Re: [PATCH v2 2/2] blk-mq: Rerun dispatching in the case of budget contention

From: Doug Anderson
Date: Sun Apr 05 2020 - 12:26:50 EST


Hi,

On Sun, Apr 5, 2020 at 7:55 AM Paolo Valente <paolo.valente@xxxxxxxxxx> wrote:
>
> > Il giorno 5 apr 2020, alle ore 16:00, Doug Anderson <dianders@xxxxxxxxxxxx> ha scritto:
> >
> > Hi,
> >
> > On Sun, Apr 5, 2020 at 2:15 AM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
> >>
> >> OK, looks it isn't specific on BFQ any more.
> >>
> >> Follows another candidate approach for this issue, given it is so hard
> >> to trigger, we can make it more reliable by rerun queue when has_work()
> >> returns true after ops->dispath_request() returns NULL.
> >>
> >> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> >> index 74cedea56034..4408e5d4fcd8 100644
> >> --- a/block/blk-mq-sched.c
> >> +++ b/block/blk-mq-sched.c
> >> @@ -80,6 +80,7 @@ void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx)
> >> blk_mq_run_hw_queue(hctx, true);
> >> }
> >>
> >> +#define BLK_MQ_BUDGET_DELAY 3 /* ms units */
> >> /*
> >> * Only SCSI implements .get_budget and .put_budget, and SCSI restarts
> >> * its queue by itself in its completion handler, so we don't need to
> >> @@ -103,6 +104,9 @@ static void blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
> >> rq = e->type->ops.dispatch_request(hctx);
> >> if (!rq) {
> >> blk_mq_put_dispatch_budget(hctx);
> >> +
> >> + if (e->type->ops.has_work && e->type->ops.has_work(hctx))
> >> + blk_mq_delay_run_hw_queue(hctx, BLK_MQ_BUDGET_DELAY);
> >
> > I agree that your patch should solve the race. With the current BFQ's
> > has_work() it's a bit of a disaster though. It will essentially put
> > blk-mq into a busy-wait loop (with a 3 ms delay between each poll)
> > while BFQ's has_work() says "true" but BFQ doesn't dispatch anything.
> >
> > ...so I guess the question that still needs to be answered: does
> > has_work() need to be exact? If so then we need the patch you propose
> > plus one to BFQ. If not, we should continue along the lines of my
> > patch.
> >
>
> Some more comments. BFQ's I/O plugging lasts 9 ms by default. So,
> with this last Ming's patch, BFQ may happen to be polled every 3ms,
> for at most three times.

Ah! I did not know this. OK, then Ming's patch seems like it should
work. If nothing else it should fix the problem. If this ends up
making BFQ chew up too much CPU time then presumably someone will
notice and BFQ's has_work() can be improved.

Ming: how do you want to proceed? Do you want to formally post the
patch? Do you want me to post a v3 of my series where I place patch
#2 with your patch? Do you want authorship (which implies adding your
Signed-off-by)?


> On the opposite end, making bfq_has_work plugging aware costs more
> complexity, and possibly one more lock. While avoiding the above
> occasional polling, this may imply a lot of overhead or CPU stalls on
> every dispatch.

I still think it would be interesting to run performance tests with my
proof-of-concept solution for has_work(). Even if it's not ideal,
knowing whether performance increased, decreased, or stayed the same
would give information about how much more effort should be put into
this.

-Doug