Re: [PATCH] blk-mq: Fix blk_mq_tagset_busy_iter() for shared tags
From: Ming Lei
Date: Mon Oct 18 2021 - 05:08:27 EST

On Mon, Oct 18, 2021 at 09:08:57AM +0100, John Garry wrote:
> On 13/10/2021 16:13, John Garry wrote:
> > > diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> > > index 72a2724a4eee..2a2ad6dfcc33 100644
> > > --- a/block/blk-mq-tag.c
> > > +++ b/block/blk-mq-tag.c
> > > @@ -232,8 +232,9 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
> > >  	if (!rq)
> > >  		return true;
> > > -	if (rq->q == hctx->queue && rq->mq_hctx == hctx)
> > > -		ret = iter_data->fn(hctx, rq, iter_data->data, reserved);
> > > +	if (rq->q == hctx->queue && (rq->mq_hctx == hctx ||
> > > +				     blk_mq_is_shared_tags(hctx->flags)))
> > > +		ret = iter_data->fn(rq->mq_hctx, rq, iter_data->data, reserved);
> > >  	blk_mq_put_rq_ref(rq);
> > >  	return ret;
> > >  }
> > > @@ -460,6 +461,9 @@ void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
> > >  		if (tags->nr_reserved_tags)
> > >  			bt_for_each(hctx, &tags->breserved_tags, fn, priv, true);
> > >  		bt_for_each(hctx, &tags->bitmap_tags, fn, priv, false);
> > > +
> > > +		if (blk_mq_is_shared_tags(hctx->flags))
> > > +			break;
> > >  	}
> > >  	blk_queue_exit(q);
> > >  }
> > >
> >
> > I suppose that is ok, and means that we iter once.
> >
> > However, I have to ask, where is the big user of
> > blk_mq_queue_tag_busy_iter() coming from? I saw this from Kashyap's
> > mail:
> >
> > > 1.31% 1.31% kworker/57:1H-k [kernel.vmlinux]
> > > native_queued_spin_lock_slowpath
> > > ret_from_fork
> > > kthread
> > > worker_thread
> > > process_one_work
> > > blk_mq_timeout_work
> > > blk_mq_queue_tag_busy_iter
> > > bt_iter
> > > blk_mq_find_and_get_req
> > > _raw_spin_lock_irqsave
> > > native_queued_spin_lock_slowpath
> >
> > How or why blk_mq_timeout_work()?
>
> Just some update: I tried hisi_sas with 10x SAS SSDs, megaraid sas with 1x
> SATA HDD (that's all I have), and null blk with lots of devices, and I still
> can't see high usage of blk_mq_queue_tag_busy_iter().

It should be easy to trigger under heavy I/O accounting, e.g.:

	while true; do cat /proc/diskstats; done

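A bounded variant of that loop can be handier when profiling, since it stops on its own. The following is only a rough sketch: the function name `hammer_diskstats`, its arguments, and the default count are illustrative and not part of this thread. It relies on the understanding that each read of /proc/diskstats for a blk-mq device computes the in-flight count by walking the tags via blk_mq_queue_tag_busy_iter().

```shell
#!/bin/sh
# hammer_diskstats is a hypothetical helper name, not an existing tool.
# Reads a stats file COUNT times, discards the contents, and prints the
# number of completed reads. Both arguments are optional.
hammer_diskstats() {
    count="${1:-100000}"          # number of reads (assumed default)
    file="${2:-/proc/diskstats}"  # file to read on every iteration
    i=0
    while [ "$i" -lt "$count" ]; do
        # Stop with a non-zero status if the file cannot be read.
        cat "$file" > /dev/null || return 1
        i=$((i + 1))
    done
    echo "$i"
}
```

For example, `hammer_diskstats 100000` could be run in one terminal while the workload and a profiler run in another.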
> So how about we get this patch processed (to fix blk_mq_tagset_busy_iter()),
> as it is independent of blk_mq_queue_tag_busy_iter()? And then wait for some
> update or some more info from Kashyap regarding blk_mq_queue_tag_busy_iter()

Looks fine:

Reviewed-by: Ming Lei <ming.lei@xxxxxxxxxx>

Thanks,
Ming