Re: [PATCH v2 3/3] block: fix ordering between checking BLK_MQ_S_STOPPED and adding requests
From: Muchun Song
Date: Tue Sep 10 2024 - 22:45:16 EST
> On Sep 10, 2024, at 21:22, Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> On 9/3/24 2:16 AM, Muchun Song wrote:
>> Supposing first scenario with a virtio_blk driver.
>>
>> CPU0 CPU1
>>
>> blk_mq_try_issue_directly()
>> __blk_mq_issue_directly()
>> q->mq_ops->queue_rq()
>> virtio_queue_rq()
>> blk_mq_stop_hw_queue()
>> virtblk_done()
>> blk_mq_request_bypass_insert() blk_mq_start_stopped_hw_queues()
>> /* Add IO request to dispatch list */ 1) store blk_mq_start_stopped_hw_queue()
>> clear_bit(BLK_MQ_S_STOPPED) 3) store
>> blk_mq_run_hw_queue() blk_mq_run_hw_queue()
>> if (!blk_mq_hctx_has_pending()) if (!blk_mq_hctx_has_pending()) 4) load
>> return return
>> blk_mq_sched_dispatch_requests() blk_mq_sched_dispatch_requests()
>> if (blk_mq_hctx_stopped()) 2) load if (blk_mq_hctx_stopped())
>> return return
>> __blk_mq_sched_dispatch_requests() __blk_mq_sched_dispatch_requests()
>>
>> Supposing another scenario.
>>
>> CPU0 CPU1
>>
>> blk_mq_requeue_work()
>> /* Add IO request to dispatch list */ 1) store virtblk_done()
>> blk_mq_run_hw_queues()/blk_mq_delay_run_hw_queues() blk_mq_start_stopped_hw_queues()
>> if (blk_mq_hctx_stopped()) 2) load blk_mq_start_stopped_hw_queue()
>> continue clear_bit(BLK_MQ_S_STOPPED) 3) store
>> blk_mq_run_hw_queue()/blk_mq_delay_run_hw_queue() blk_mq_run_hw_queue()
>> if (!blk_mq_hctx_has_pending()) 4) load
>> return
>> blk_mq_sched_dispatch_requests()
>>
>> Both scenarios are similar, the full memory barrier should be inserted between
>> 1) and 2), as well as between 3) and 4) to make sure that either CPU0 sees
>> BLK_MQ_S_STOPPED is cleared or CPU1 sees dispatch list. Otherwise, either CPU
>> will not rerun the hardware queue causing starvation of the request.
>>
>> The easy way to fix it is to add the essential full memory barrier into helper
>> of blk_mq_hctx_stopped(). In order to not affect the fast path (hardware queue
>> is not stopped most of the time), we only insert the barrier into the slow path.
>> Actually, only slow path needs to care about missing of dispatching the request
>> to the low-level device driver.
>
> Again, this is way too wide, it's unreadable.
OK. Will do.
Muchun,
Thanks.
>
> Patch looks fine, though.
>
> --
> Jens Axboe