Re: [PATCH RFC 1/3] block/mq-deadline: Revert "block/mq-deadline: Fix the tag reservation code"

From: Yu Kuai
Date: Tue Dec 10 2024 - 01:23:00 EST


Hi,

On 2024/12/10 9:50, Yu Kuai wrote:
Hi,

On 2024/12/10 2:02, Bart Van Assche wrote:
This is not correct. dd->async_depth can be modified via sysfs.

How about the following patch to fix min_shallow_depth for deadline?

Thanks,
Kuai

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index a9cf8e19f9d1..040ebb0b192d 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -667,8 +667,7 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
        struct blk_mq_tags *tags = hctx->sched_tags;

        dd->async_depth = q->nr_requests;
-
-       sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, 1);
+       sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
 }

 /* Called by blk_mq_init_hctx() and blk_mq_init_sched(). */
@@ -1012,6 +1011,47 @@ SHOW_INT(deadline_fifo_batch_show, dd->fifo_batch);
 #undef SHOW_INT
 #undef SHOW_JIFFIES

+static ssize_t deadline_async_depth_store(struct elevator_queue *e,
+                                         const char *page, size_t count)
+{
+       struct deadline_data *dd = e->elevator_data;
+       struct request_queue *q = dd->q;
+       struct blk_mq_hw_ctx *hctx;
+       unsigned long i;
+       int v;
+       int ret = kstrtoint(page, 0, &v);
+
+       if (ret < 0)
+               return ret;
+
+       if (v < 1)
+               v = 1;
+       else if (v > dd->q->nr_requests)
+               v = dd->q->nr_requests;
+
+       if (v == dd->async_depth)
+               return count;
+
+       blk_mq_freeze_queue(q);
+       blk_mq_quiesce_queue(q);
+
+       dd->async_depth = v;
+       if (blk_mq_is_shared_tags(q->tag_set->flags)) {
+               sbitmap_queue_min_shallow_depth(
+                       &q->sched_shared_tags->bitmap_tags, dd->async_depth);
+       } else {
+               queue_for_each_hw_ctx(q, hctx, i)
+                       sbitmap_queue_min_shallow_depth(
+                               &hctx->sched_tags->bitmap_tags,
+                               dd->async_depth);
+       }
+
+       blk_mq_unquiesce_queue(q);
+       blk_mq_unfreeze_queue(q);
+
+       return count;
+}

Just realized that this is not OK: q->sysfs_lock must be held to protect
against the hctxs changing; however, the lock ordering is q->sysfs_lock
before eq->sysfs_lock, and this context already holds eq->sysfs_lock.
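
For reference, here is a rough sketch of the two sysfs paths involved, as
I understand them from queue_attr_store() in blk-sysfs.c and
elv_attr_store() in elevator.c (sketch only, the exact call chains may
differ between kernel versions):

/*
 * CPU0 (this patch)                      CPU1 (updating nr_requests)
 *
 * elv_attr_store()                       queue_attr_store()
 *   mutex_lock(&eq->sysfs_lock)            mutex_lock(&q->sysfs_lock)
 *   deadline_async_depth_store()           queue_requests_store()
 *     queue_for_each_hw_ctx(q, hctx, i)      blk_mq_update_nr_requests()
 *       // hctx->sched_tags may be             // resizes or replaces
 *       // resized/replaced under us           // hctx->sched_tags
 *
 * Taking q->sysfs_lock around the hctx walk on CPU0 would close the
 * race, but it would nest q->sysfs_lock inside eq->sysfs_lock, i.e. the
 * reverse of the established q->sysfs_lock -> eq->sysfs_lock order.
 */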

First of all, are we in agreement that it's not acceptable to sacrifice
performance in the default scenario just to ensure functional
correctness when async_depth is set to 1?

If so, the following are the options I can think of to fix this:

1) Make async_depth read-only; if limiting async requests to 75% of the
tags hurts performance in some cases, the user can increase nr_requests
to compensate (a rough sketch of this follows below).
2) Refactor the elevator sysfs API: remove eq->sysfs_lock and replace it
with q->sysfs_lock, so that deadline_async_depth_store() is protected
against the hctxs changing and min_shallow_depth can be updated there.
3) Other options?
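
To make option 1) concrete, below is a minimal sketch of what it could
look like in mq-deadline.c, assuming the existing DD_ATTR() /
deadline_attrs[] layout; DD_ATTR_RO is just a name made up for this
sketch, and deadline_async_depth_store() plus its STORE_INT() entry
would be dropped:

/* Sketch only, not compile-tested. */
#define DD_ATTR_RO(name) \
	__ATTR(name, 0444, deadline_##name##_show, NULL)

static struct elv_fs_entry deadline_attrs[] = {
	DD_ATTR(read_expire),
	/* ... other tunables keep DD_ATTR() as today ... */
	DD_ATTR_RO(async_depth),	/* always derived from nr_requests */
	DD_ATTR(fifo_batch),
	__ATTR_NULL
};

Since elv_attr_store() already returns -EIO when ->store is NULL, and
mode 0444 rejects writes at the sysfs level, no other changes should be
needed for writes to fail cleanly.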

Thanks,
Kuai