Re: [PATCH v2 3/3] block: fix ordering between checking BLK_MQ_S_STOPPED and adding requests

From: Jens Axboe
Date: Tue Sep 10 2024 - 09:23:12 EST


On 9/3/24 2:16 AM, Muchun Song wrote:
> Supposing first scenario with a virtio_blk driver.
>
> CPU0 CPU1
>
> blk_mq_try_issue_directly()
> __blk_mq_issue_directly()
> q->mq_ops->queue_rq()
> virtio_queue_rq()
> blk_mq_stop_hw_queue()
> virtblk_done()
> blk_mq_request_bypass_insert() blk_mq_start_stopped_hw_queues()
> /* Add IO request to dispatch list */ 1) store blk_mq_start_stopped_hw_queue()
> clear_bit(BLK_MQ_S_STOPPED) 3) store
> blk_mq_run_hw_queue() blk_mq_run_hw_queue()
> if (!blk_mq_hctx_has_pending()) if (!blk_mq_hctx_has_pending()) 4) load
> return return
> blk_mq_sched_dispatch_requests() blk_mq_sched_dispatch_requests()
> if (blk_mq_hctx_stopped()) 2) load if (blk_mq_hctx_stopped())
> return return
> __blk_mq_sched_dispatch_requests() __blk_mq_sched_dispatch_requests()
>
> Supposing another scenario.
>
> CPU0 CPU1
>
> blk_mq_requeue_work()
> /* Add IO request to dispatch list */ 1) store virtblk_done()
> blk_mq_run_hw_queues()/blk_mq_delay_run_hw_queues() blk_mq_start_stopped_hw_queues()
> if (blk_mq_hctx_stopped()) 2) load blk_mq_start_stopped_hw_queue()
> continue clear_bit(BLK_MQ_S_STOPPED) 3) store
> blk_mq_run_hw_queue()/blk_mq_delay_run_hw_queue() blk_mq_run_hw_queue()
> if (!blk_mq_hctx_has_pending()) 4) load
> return
> blk_mq_sched_dispatch_requests()
>
> Both scenarios are similar, the full memory barrier should be inserted between
> 1) and 2), as well as between 3) and 4) to make sure that either CPU0 sees
> BLK_MQ_S_STOPPED is cleared or CPU1 sees dispatch list. Otherwise, either CPU
> will not rerun the hardware queue causing starvation of the request.
>
> The easy way to fix it is to add the essential full memory barrier into helper
> of blk_mq_hctx_stopped(). In order to not affect the fast path (hardware queue
> is not stopped most of the time), we only insert the barrier into the slow path.
> Actually, only slow path needs to care about missing of dispatching the request
> to the low-level device driver.

Again, this is way too wide, it's unreadable.

Patch looks fine, though.

--
Jens Axboe