Re: [External Mail]Re: [PATCH v6] block: move non sync requests complete flow to softirq

From: Jens Axboe
Date: Mon Sep 09 2024 - 09:25:08 EST


On 9/8/24 8:17 PM, ZhangHui wrote:
> On 2024/9/7 21:46, Jens Axboe wrote:
>> On 9/6/24 8:49 PM, ZhangHui wrote:
>>> From: zhanghui <zhanghui31@xxxxxxxxxx>
>>>
>>> Currently, for a controller that supports multiple queues, like UFS4.0,
>>> mq_ops->complete is executed in the interrupt top-half. Therefore,
>>> the file system's end io, such as f2fs_write_end_io on smartphones,
>>> is executed during the request completion process.
>>>
>>> However, we found that the execution time of the file system end io
>>> is strongly related to the size of the bio and the processing speed
>>> of the CPU. Because the file system's end io traverses every page
>>> in the bio, this is a very time-consuming operation.
>>>
>>> We measured that an 80M bio write on a little CPU causes the
>>> top-half to run for more than 100ms, which will undoubtedly affect
>>> interrupt response latency.
>>>
>>> Let's fix this issue by moving non-sync request completion to softirq
>>> context, and keeping sync request completion in the IRQ top-half context.
>> You keep ignoring the feedback, and hence I too shall be ignoring this
>> patch going forward then.
>>
>> The key issue here is that the completion takes so long, and adding a
>> heuristic that equates not-sync with latency-not-important is pretty
>> bogus and not a good way to attempt to work around it.
>>
>> --
>> Jens Axboe
>>
> Hi Jens,
>
> Sorry for not replying in time.
>
> We have basically determined the plan for the f2fs side. The short-term
> plan is to limit the size of a single bio, and the long-term plan is to
> change f2fs from page to folio to reduce the pagecache traversal time.
>
> However, I think it also makes sense to move less urgent work out of the
> IRQ top-half.

What you are missing is that a request you consider less urgent may be
just as urgent to its submitter as any other request. !rq_is_sync()
doesn't mean that it's necessarily a background or low priority request.
So no, I'm not interested in merging an odd work-around for what is
really a different issue.
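
As an illustration of why !rq_is_sync() says little about urgency, here
is a minimal userspace model of the classification. It assumes the
kernel predicate has the shape "reads are always sync; other ops are sync
only when flagged REQ_SYNC, REQ_FUA or REQ_PREFLUSH". The MODEL_* names
and bit values are hypothetical placeholders, not the kernel's encodings.

```c
#include <stdbool.h>

/* Hypothetical encodings for illustration only; NOT the kernel's values. */
enum {
	MODEL_OP_READ  = 0,
	MODEL_OP_WRITE = 1,
	MODEL_OP_MASK  = 0xff,
	MODEL_SYNC     = 1 << 8,
	MODEL_FUA      = 1 << 9,
	MODEL_PREFLUSH = 1 << 10,
};

/*
 * Model of the sync test: a read is always sync, anything else is sync
 * only if one of the three flags is set.
 */
static bool model_op_is_sync(unsigned int opf)
{
	return (opf & MODEL_OP_MASK) == MODEL_OP_READ ||
	       (opf & (MODEL_SYNC | MODEL_FUA | MODEL_PREFLUSH)) != 0;
}
```

In this model a write with no flags set is classified as non-sync. The
predicate only inspects the op and its flags, so it carries no
information about how latency-sensitive the submitter is.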

Fixing f2fs is indeed the right way, and I'd suggest in the meantime
you just limit the per-request size to something a lot more reasonable.
If you see high latencies with 80MB requests, then perhaps don't be
doing 80MB requests. That should be well beyond the diminishing returns
point for bandwidth anyway, there's no reason why anyone should be doing
requests that huge and not expect longer processing times.
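
For a rough sense of scale, a small sketch of the arithmetic, under
stated assumptions: 4 KiB pages, 80MB read as 80 MiB, and an end io cost
that grows with the number of pages walked (as the patch description
says). The helper names are made up for this example, and the 1 MiB cap
in the note below is an arbitrary value, not a recommendation.

```c
#include <stddef.h>

/* Pages an end io handler would walk for one request of `bytes` bytes. */
static size_t pages_per_request(size_t bytes, size_t page_size)
{
	return (bytes + page_size - 1) / page_size;
}

/* Requests needed to carry `total` bytes when each is capped at `cap`. */
static size_t requests_needed(size_t total, size_t cap)
{
	return (total + cap - 1) / cap;
}
```

With these assumptions, one 80 MiB request means 20480 pages walked in a
single completion. Capped at 1 MiB, the same data becomes 80 requests of
256 pages each, so no single completion has to walk more than 256 pages.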

--
Jens Axboe