Re: [PATCH 4/5] blk-mq: do limited block plug for multiple queue case
From: Shaohua Li
Date: Mon May 04 2015 - 15:41:52 EST
On Fri, May 01, 2015 at 04:16:04PM -0400, Jeff Moyer wrote:
> Shaohua Li <shli@xxxxxx> writes:
>
> > plug is still helpful for workload with IO merge, but it can be harmful
> > otherwise especially with multiple hardware queues, as there is
> > (supposed) no lock contention in this case and plug can introduce
> > latency. For multiple queues, we do limited plug, eg plug only if there
> > is request merge. If a request doesn't have merge with following
> > request, the request will be dispatched immediately.
> >
> > This also fixes a bug. If we directly issue a request and it fails, we
> > use blk_mq_merge_queue_io(). But we already assigned bio to a request in
> > blk_mq_bio_to_request. blk_mq_merge_queue_io shouldn't run
> > blk_mq_bio_to_request again.
>
> Good catch. Might've been better to split that out first for easy
> backport to stable kernels, but I won't hold you to that.
It's not a severe bug, but I don't mind. Jens, please let me know if I
should split the patch into 2 patches.
> > @@ -1243,6 +1277,10 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
> > return;
> > }
> >
> > + if (likely(!is_flush_fua) && !blk_queue_nomerges(q) &&
> > + blk_attempt_plug_merge(q, bio, &request_count))
> > + return;
> > +
> > rq = blk_mq_map_request(q, bio, &data);
> > if (unlikely(!rq))
> > return;
>
> After this patch, everything up to this point in blk_mq_make_request and
> blk_sq_make_request is the same. This can be factored out (in another
> patch) to a common function.
I'll leave this for a separate cleanup if a good function name is found.
> > @@ -1253,38 +1291,38 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
> > goto run_queue;
> > }
> >
> > + plug = current->plug;
> > /*
> > * If the driver supports defer issued based on 'last', then
> > * queue it up like normal since we can potentially save some
> > * CPU this way.
> > */
> > - if (is_sync && !(data.hctx->flags & BLK_MQ_F_DEFER_ISSUE)) {
> > - struct blk_mq_queue_data bd = {
> > - .rq = rq,
> > - .list = NULL,
> > - .last = 1
> > - };
> > - int ret;
> > + if ((plug || is_sync) && !(data.hctx->flags & BLK_MQ_F_DEFER_ISSUE)) {
> > + struct request *old_rq = NULL;
>
> I would add a !blk_queue_nomerges(q) to that conditional. There's no
> point holding back an I/O when we won't merge it anyway.
Good catch! Fixed.
> That brings up another quirk of the current implementation (not your
> patches) that bugs me.
>
> BLK_MQ_F_SHOULD_MERGE
> QUEUE_FLAG_NOMERGES
>
> Those two flags are set independently, one via the driver and the other
> via a sysfs file. So the user could set the nomerges flag to 1 or 2,
> and still potentially get merges (see blk_mq_merge_queue_io). That's
> something that should be fixed, albeit that can wait.
Agreed.
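
For illustration only, a minimal user-space sketch of what a single, consistent check might look like, assuming merging should require both the driver's opt-in and the absence of the user's opt-out. The names toy_merge_allowed, TOY_HCTX_SHOULD_MERGE and TOY_QUEUE_NOMERGES are hypothetical stand-ins, the bit values are made up, and the two nomerges levels (1 and 2) are collapsed into one flag for brevity. This is not the kernel's code.

```c
#include <stdbool.h>

/* Illustrative bit values only; these are NOT the kernel's definitions. */
#define TOY_HCTX_SHOULD_MERGE (1u << 0) /* stands in for BLK_MQ_F_SHOULD_MERGE (set by the driver) */
#define TOY_QUEUE_NOMERGES    (1u << 0) /* stands in for QUEUE_FLAG_NOMERGES (set via sysfs) */

/*
 * Hypothetical helper: merging is allowed only when the driver asked for
 * it AND the user has not disabled it on the queue.
 */
static bool toy_merge_allowed(unsigned int hctx_flags, unsigned int queue_flags)
{
	/* The user's sysfs setting wins over the driver's preference. */
	if (queue_flags & TOY_QUEUE_NOMERGES)
		return false;

	return (hctx_flags & TOY_HCTX_SHOULD_MERGE) != 0;
}
```

With a helper along these lines, a path such as blk_mq_merge_queue_io would consult both flags rather than only the driver's one.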
> > blk_mq_bio_to_request(rq, bio);
> >
> > /*
> > - * For OK queue, we are done. For error, kill it. Any other
> > - * error (busy), just add it to our list as we previously
> > - * would have done
> > + * we do limited plugging. If bio can be merged, do merge.
> > + * Otherwise the existing request in the plug list will be
> > + * issued. So the plug list will have one request at most
> > */
> > - ret = q->mq_ops->queue_rq(data.hctx, &bd);
> > - if (ret == BLK_MQ_RQ_QUEUE_OK)
> > - goto done;
> > - else {
> > - __blk_mq_requeue_request(rq);
> > -
> > - if (ret == BLK_MQ_RQ_QUEUE_ERROR) {
> > - rq->errors = -EIO;
> > - blk_mq_end_request(rq, rq->errors);
> > - goto done;
> > + if (plug) {
> > + if (!list_empty(&plug->mq_list)) {
> > + old_rq = list_first_entry(&plug->mq_list,
> > + struct request, queuelist);
> > + list_del_init(&old_rq->queuelist);
> > }
> > - }
> > + list_add_tail(&rq->queuelist, &plug->mq_list);
> > + } else /* is_sync */
> > + old_rq = rq;
> > + blk_mq_put_ctx(data.ctx);
> > + if (!old_rq)
> > + return;
> > + if (!blk_mq_direct_issue_request(old_rq))
> > + return;
> > + blk_mq_insert_request(old_rq, false, true, true);
> > + return;
> > }
>
> Now there is no way to exit that if block, we always return. It may be
> worth considering moving that block to its own function, if you can think
> of a good name for it.
I'll leave this for later work.
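
To make the control flow of that block easier to follow, here is a rough user-space model of the "plug list holds at most one request" behaviour. It is a sketch under stated assumptions, not the patch itself: struct toy_request, struct toy_plug, toy_try_merge and toy_limited_plug are hypothetical names, only back merges of contiguous sectors are modelled, and the merge is done on whole requests here, whereas the patch merges the bio earlier via blk_attempt_plug_merge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; not the kernel's struct request or struct blk_plug. */
struct toy_request {
	unsigned long sector;    /* first sector of the I/O */
	unsigned int nr_sectors; /* length in sectors */
};

struct toy_plug {
	struct toy_request *held; /* models plug->mq_list with at most one entry */
};

/* Back merge only: rq must start exactly where the held request ends. */
static bool toy_try_merge(struct toy_plug *plug, const struct toy_request *rq)
{
	struct toy_request *held = plug->held;

	if (held == NULL || held->sector + held->nr_sectors != rq->sector)
		return false;

	held->nr_sectors += rq->nr_sectors;
	return true;
}

/*
 * Returns the request that should be issued right now, or NULL if nothing
 * needs to be issued yet. A NULL plug models the "else: is_sync" branch,
 * where the new request is issued directly.
 */
static struct toy_request *toy_limited_plug(struct toy_plug *plug,
					    struct toy_request *rq)
{
	struct toy_request *old_rq;

	assert(rq != NULL);

	if (plug == NULL)
		return rq;      /* no plug: issue the request itself */

	if (toy_try_merge(plug, rq))
		return NULL;    /* merged into the held request */

	old_rq = plug->held;    /* may be NULL if the plug was empty */
	plug->held = rq;        /* the new request takes the single slot */
	return old_rq;          /* the previous occupant is issued now */
}
```

Submitting requests for sectors 0-7, 8-15 and then 100-107 through one plug would merge the first two, and the third submission would hand back the merged request for immediate issue while the third stays held.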
> Other than those minor issues, this looks good to me.
Thanks for your time!