[PATCH 4/4] block: fix ordering between checking QUEUE_FLAG_QUIESCED and adding requests to hctx->dispatch

From: Muchun Song
Date: Sun Aug 11 2024 - 06:20:50 EST


Consider the following scenario:

CPU0                                      CPU1

blk_mq_request_issue_directly()           blk_mq_unquiesce_queue()
  if (blk_queue_quiesced())                 blk_queue_flag_clear(QUEUE_FLAG_QUIESCED)  3) store
    blk_mq_insert_request()                 blk_mq_run_hw_queues()
      /*                                      blk_mq_run_hw_queue()
       * Add request to dispatch list           if (!blk_mq_hctx_has_pending())        4) load
       * or set bitmap of software                return
       * queue.            1) store
       */
    blk_mq_run_hw_queue()
      if (blk_queue_quiesced())  2) load
        return
      blk_mq_sched_dispatch_requests()

A full memory barrier is needed between 1) and 2), as well as between
3) and 4), so that either CPU0 sees QUEUE_FLAG_QUIESCED cleared or CPU1
sees the request on the dispatch list or the bit set in the software queue
bitmap. Otherwise, neither CPU re-runs the hardware queue and the request
starves.
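
The store/load pairing in 1) to 4) is the classic store-buffering pattern.
As a rough user-space illustration only, the sketch below models it with
C11 atomics and pthreads. Everything in it is an assumption made for the
example: quiesced, has_pending, submitter(), unquiescer() and
count_stalls() are hypothetical names, not kernel symbols, and
atomic_thread_fence(memory_order_seq_cst) merely stands in for smp_mb().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state, not real kernel symbols. */
static atomic_bool quiesced;     /* models QUEUE_FLAG_QUIESCED */
static atomic_bool has_pending;  /* models dispatch list / software queue bitmap */
static bool cpu0_ran, cpu1_ran;  /* each written by one thread, read after join */

/* CPU0: publish the request, then check whether the queue may run. */
static void *submitter(void *arg)
{
	(void)arg;
	atomic_store_explicit(&has_pending, true, memory_order_relaxed);   /* 1) store */
	atomic_thread_fence(memory_order_seq_cst);   /* stands in for smp_mb() */
	cpu0_ran = !atomic_load_explicit(&quiesced, memory_order_relaxed); /* 2) load */
	return NULL;
}

/* CPU1: clear the quiesced flag, then run only if work is pending. */
static void *unquiescer(void *arg)
{
	(void)arg;
	atomic_store_explicit(&quiesced, false, memory_order_relaxed);     /* 3) store */
	atomic_thread_fence(memory_order_seq_cst);   /* stands in for smp_mb() */
	cpu1_ran = atomic_load_explicit(&has_pending, memory_order_relaxed); /* 4) load */
	return NULL;
}

/* Returns how many trials ended with neither side running the queue. */
static int count_stalls(int trials)
{
	int stalls = 0;

	for (int i = 0; i < trials; i++) {
		pthread_t t0, t1;
		int rc0, rc1;

		/* Start quiesced with nothing pending. */
		atomic_store(&quiesced, true);
		atomic_store(&has_pending, false);

		rc0 = pthread_create(&t0, NULL, submitter, NULL);
		rc1 = pthread_create(&t1, NULL, unquiescer, NULL);
		assert(rc0 == 0 && rc1 == 0);

		pthread_join(t0, NULL);
		pthread_join(t1, NULL);

		if (!cpu0_ran && !cpu1_ran)
			stalls++;
	}
	return stalls;
}
```

With both fences in place, the C11 memory model forbids the outcome where
both loads return the stale value, so count_stalls() should return 0.
Dropping either fence permits that outcome, which corresponds to the
missed queue run described above.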

Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
---
block/blk-mq.c | 38 +++++++++++++++++++++++++++-----------
1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 385a74e566874..66b21407a9a6c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -264,6 +264,13 @@ void blk_mq_unquiesce_queue(struct request_queue *q)
;
} else if (!--q->quiesce_depth) {
blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
+ /**
+ * The need of memory barrier is in blk_mq_run_hw_queues() to
+ * make sure clearing of QUEUE_FLAG_QUIESCED is before the
+ * checking of dispatch list or bitmap of any software queue.
+ *
+ * smp_mb__after_atomic();
+ */
run_queue = true;
}
spin_unlock_irqrestore(&q->queue_lock, flags);
@@ -2222,6 +2229,21 @@ void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
{
bool need_run;

+ /*
+ * This barrier is used to order adding of dispatch list or setting
+ * of bitmap of any software queue outside of this function and the
+ * test of BLK_MQ_S_STOPPED in the following routine. Pairs with the
+ * barrier in blk_mq_start_stopped_hw_queue(). So dispatch code could
+ * either see BLK_MQ_S_STOPPED is cleared or dispatch list or setting
+ * of bitmap of any software queue to avoid missing dispatching
+ * requests.
+ *
+ * This barrier is also used to order adding of dispatch list or
+ * setting of bitmap of any software queue outside of this function
+ * and test of QUEUE_FLAG_QUIESCED below.
+ */
+ smp_mb();
+
/*
* We can't run the queue inline with interrupts disabled.
*/
@@ -2244,17 +2266,6 @@ void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
if (!need_run)
return;

- /*
- * This barrier is used to order adding of dispatch list or setting
- * of bitmap of any software queue outside of this function and the
- * test of BLK_MQ_S_STOPPED in the following routine. Pairs with the
- * barrier in blk_mq_start_stopped_hw_queue(). So dispatch code could
- * either see BLK_MQ_S_STOPPED is cleared or dispatch list or setting
- * of bitmap of any software queue to avoid missing dispatching
- * requests.
- */
- smp_mb();
-
if (async || !cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask)) {
blk_mq_delay_run_hw_queue(hctx, 0);
return;
@@ -2308,6 +2319,11 @@ void blk_mq_run_hw_queues(struct request_queue *q, bool async)
* either see BLK_MQ_S_STOPPED is cleared or dispatch list or setting
* of bitmap of any software queue to avoid missing dispatching
* requests.
+ *
+ * This barrier is also used to order clearing of QUEUE_FLAG_QUIESCED
+ * outside of this function in blk_mq_unquiesce_queue() and checking
+ * of dispatch list or bitmap of any software queue in
+ * blk_mq_run_hw_queue().
*/
smp_mb();

--
2.20.1