Re: [PATCH] block: add max_dispatch to sysfs

From: Damien Le Moal
Date: Thu Apr 11 2024 - 01:19:25 EST


On 4/10/24 22:17, Jens Axboe wrote:
> On 4/10/24 4:18 AM, Dongliang Cui wrote:
>> The default configuration in the current code is that when the device
>> is not busy, a single dispatch will attempt to pull 'nr_requests'
>> requests out of the schedule queue.
>>
>> I tried to track the dispatch process:
>>
>> COMM TYPE SEC_START IOPRIO INDEX
>> fio-17304 R 196798040 0x2005 0
>> fio-17306 R 197060504 0x2005 1
>> fio-17307 R 197346904 0x2005 2
>> fio-17308 R 197609400 0x2005 3
>> fio-17309 R 197873048 0x2005 4
>> fio-17310 R 198134936 0x2005 5
>> ...
>> fio-17237 R 197122936 0x0 57
>> fio-17238 R 197384984 0x0 58
>> <...>-17239 R 197647128 0x0 59
>> fio-17240 R 197909208 0x0 60
>> fio-17241 R 198171320 0x0 61
>> fio-17242 R 198433432 0x0 62
>> fio-17300 R 195744088 0x2005 0
>> fio-17301 R 196008504 0x2005 0
>>
>> The above data is calculated based on the block event trace, with each
>> column containing: process name, request type, sector start address,
>> IO priority.
>>
>> The INDEX represents the order in which the requests are extracted from
>> the scheduler queue during a single dispatch process.
>>
>> Some low-speed devices cannot process all of these requests at once, so
>> the excess requests are requeued to hctx->dispatch, where they wait for
>> the next dispatch.
>>
>> This causes a problem when IO priority is enabled: if a single dispatch
>> pulls 'nr_requests' requests at once, IO priority is effectively ignored
>> and all requests are extracted from the scheduler queue.
>>
>> In this scenario, if a high priority request is inserted into the scheduler
>> queue, it needs to wait for the low priority request in the hctx->dispatch
>> to be processed first.
>>
>> --------------------dispatch 1st----------------------
>> fio-17241 R 198171320 0x0 61
>> fio-17242 R 198433432 0x0 62
>> --------------------dispatch 2nd----------------------
>> fio-17300 R 195744088 0x2005 0
>>
>> In certain scenarios, we want requests to be processed in order of IO
>> priority as much as possible.
>>
>> Maybe max_dispatch should not be a fixed value, but can be adjusted
>> according to device conditions.
>>
>> So we provide an interface to control the maximum size of a single
>> dispatch so that users can configure it according to device
>> characteristics.
>
> I agree that pulling 'nr_requests' out of the scheduler will kind of
> defeat the purpose of the scheduler to some extent. But rather than add
> another knob that nobody knows about or ever will touch (and extra queue
> variables that just take up space), why not just default to something a
> bit saner? Eg we could default to 1/8 or 1/4 of the scheduler depth
> instead.

Why not default to pulling only what can actually be executed, that is, up to
the number of free hw tags / budget? Anything more than that will be requeued
anyway.
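
To illustrate the difference, here is a rough userspace toy model in Python.
It is not kernel code: the two-tag device, the request names, the numeric
ranks (lower is more urgent) and the helpers run(), pull_all() and
pull_free_tags() are all made up for the example, and it assumes every
in-flight request completes between two dispatch passes.

```python
from collections import deque


def pull_all(free_tags, queued):
    # Model of pulling everything the scheduler holds (the 'nr_requests' case).
    return queued


def pull_free_tags(free_tags, queued):
    # Model of pulling only what the device can accept right now.
    return min(free_tags, queued)


def run(batch_limit, tags=2):
    """Return the order in which requests reach the (toy) device."""
    # (rank, name) pairs; lower rank is more urgent. Hypothetical workload.
    sched = [(1, "low0"), (1, "low1"), (1, "low2"), (1, "low3")]
    requeued = deque()  # stands in for hctx->dispatch
    issued = []
    first_pass = True
    while sched or requeued:
        free = tags  # assumption: all tags are free again at each pass
        # Requests that were already pulled go first, whatever their priority.
        while requeued and free:
            issued.append(requeued.popleft()[1])
            free -= 1
        if not requeued:
            sched.sort(key=lambda r: r[0])  # stable sort: priority order
            n = batch_limit(free, len(sched))
            pulled = sched[:n]
            del sched[:n]
            for req in pulled:
                if free:
                    issued.append(req[1])
                    free -= 1
                else:
                    requeued.append(req)  # device full: requeue
        if first_pass:
            # A high priority request arrives after the first dispatch pass.
            sched.append((0, "high"))
            first_pass = False
    return issued


if __name__ == "__main__":
    print(run(pull_all))
    print(run(pull_free_tags))
```

In this model, pulling everything issues "high" last, behind the two low
priority requests parked in the requeue list, while limiting the pull to the
free tags lets "high" go out in the second pass.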

--
Damien Le Moal
Western Digital Research