Re: [External Mail]Re: [PATCH v3] block: move non sync requests complete flow to softirq
From: 章辉
Date: Wed Sep 04 2024 - 22:47:00 EST
On 2024/9/4 16:01, Ming Lei wrote:
> On Tue, Sep 03, 2024 at 07:54:37PM +0800, ZhangHui wrote:
>> From: zhanghui <zhanghui31@xxxxxxxxxx>
>>
>> Currently, for a controller that supports multiple queues, like UFS4.0,
>> the mq_ops->complete is executed in the interrupt top-half. Therefore,
>> the file system's end io is executed during the request completion process,
>> such as f2fs_write_end_io on smartphone.
>>
>> However, we found that the execution time of the file system end io
>> is strongly related to the size of the bio and the processing speed
>> of the CPU. Because the file system's end io will traverse every page
>> in bio, this is a very time-consuming operation.
>>
>> We measured that the 80M bio write operation on the little CPU will
> What is 80M bio?
>
> It is one known issue that soft lockup may be triggered in case of N:M
> blk-mq mapping, but not sure if that is the case.
>
> What is nr_hw_queues(blk_mq) and nr_cpus in your system?
>
>> cause the execution time of the top-half to be greater than 100ms.
>> The CPU tick on a smartphone is only 4ms, which will undoubtedly affect
>> scheduling efficiency.
> schedule is off too in softirq(bottom-half).
>
>> Let's fix this issue by moving the non-sync request completion flow to
>> softirq, and keeping the sync request completion in the top-half.
> If you do care about interrupt-off or schedule-off latency, you may have to
> move the IO handling into thread context in the driver.
>
> BTW, threaded irq can't help you either.
>
>
> Thanks,
> Ming
>
Hi Ming,
Thank you, that is a very useful reminder.
On smartphones, nr_hw_queues and nr_cpus are mapped 1:1. I am more concerned
about the interrupt-off latency, which is more pronounced on little cores.
Moving the time-consuming work to the bottom half may not help with scheduling
latency, but could it help the interrupt response latency of other
modules in the system?
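For a rough sense of scale, the figures quoted above can be turned into a per-page cost. This is only a back-of-envelope sketch: it assumes 4 KiB pages and reads "80M" as 80 MiB of data, neither of which is stated in this thread, and it uses 100 ms as the measured lower bound. The function names are made up for illustration.

```python
# Back-of-envelope estimate from the figures quoted in the patch description.
# PAGE_SIZE and the reading of "80M" as 80 MiB are assumptions, not measurements.
PAGE_SIZE = 4096                  # assumed 4 KiB pages
BIO_BYTES = 80 * 1024 * 1024      # assumed: "80M" means 80 MiB of data
TOP_HALF_NS = 100_000_000         # 100 ms, the quoted lower bound for the top-half
TICK_NS = 4_000_000               # 4 ms tick, as quoted


def pages_in_bio(bio_bytes=BIO_BYTES, page_size=PAGE_SIZE):
    """Number of pages the end_io handler has to traverse."""
    return bio_bytes // page_size


def ns_per_page(total_ns=TOP_HALF_NS, pages=None):
    """Average end_io cost per page implied by the total time."""
    if pages is None:
        pages = pages_in_bio()
    return total_ns / pages


def ticks_spanned(total_ns=TOP_HALF_NS, tick_ns=TICK_NS):
    """How many tick periods elapse while the top-half runs."""
    return total_ns // tick_ns


if __name__ == "__main__":
    print("pages:", pages_in_bio())
    print("ns per page:", ns_per_page())
    print("ticks spanned:", ticks_spanned())
```

Under these assumptions the handler walks 20480 pages at roughly 4.9 us per page, and interrupts stay off on that CPU for about 25 tick periods.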
Thanks
Zhang