RE: REGRESSION BISECTED: mq-deadline: covert to use request_queue->async_depth

From: Doug Smythies

Date: Fri Apr 10 2026 - 01:19:29 EST

On 2026.04.09 05:54 Yu Kuai wrote:
> Hi,
>
> On 2026/4/9 7:20, Doug Smythies wrote:
>> Hi,
>>
>> I observed a couple of regressions in a workflow (described further below).
>> I have not started to investigate the first regression yet.
>> The first regression was of lesser magnitude and was introduced somewhere between kernels 6.18 and 6.19-rc1.
>> For the second regression, I bisected the kernel, and then double checked the result:
>>
>>> commit 988bb1b9ededab9aed83df8c1f5be0232b71ded3
>>> Author: Yu Kuai <yukuai@xxxxxxxxx>
>>> Date: Tue Feb 3 16:19:47 2026 +0800
>>> mq-deadline: covert to use request_queue->async_depth
>>>
>>> In downstream kernel, we test with mq-deadline with many fio workloads, and
>>> we found a performance regression after commit 39823b47bbd4
>>> ...
>
> Do you ever use the async_depth for mq-deadline?

No.

> For this patch, the old mq-deadline sysfs api is now removed, if you ever set this, then
> you'll need to set the new api under request_queue.
>
> Otherwise, if you're using the default value before, I don't see how this patch can make
> a difference for IO.
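
To check whether async_depth was ever set on a given disk, something like the
following could be used. This is only a minimal sketch: the function name
read_async_depth is hypothetical, and both sysfs paths are assumptions (the old
mq-deadline attribute under queue/iosched/, the new one under queue/ as suggested
by the title of commit f98afe4f31bb). Verify them against the running kernel.

```python
from pathlib import Path


def read_async_depth(dev: str, sysfs_root: str = "/sys") -> dict:
    """Return the async_depth values found for block device `dev`.

    A value of None means the attribute is absent or unreadable.
    Both locations below are assumptions, not confirmed by this thread.
    """
    queue = Path(sysfs_root) / "block" / dev / "queue"
    candidates = {
        # Assumed new location (request_queue attribute).
        "queue": queue / "async_depth",
        # Assumed old location (mq-deadline elevator attribute).
        "iosched": queue / "iosched" / "async_depth",
    }
    found = {}
    for name, path in candidates.items():
        try:
            found[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            found[name] = None
    return found
```

For example, read_async_depth("sda") returns a dict with the keys "queue" and
"iosched"; the sysfs_root parameter exists only so the function can be pointed
at a test directory.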

The jobs per second, or disk read rate, is the same, but some jobs are forgotten about
for much longer, which greatly increases the higher-percentile latencies
under conditions where the job requests are backing up.

I'll attach several graphs from a test that ramps up the requested jobs per second.

>> Workflow:
>> My version of "critical-jobs", an attempt to do something similar to the non-free SPECjbb critical-jOPS.
>>
>> The database file is 500 Gigabytes.
>> Each requested job does 2 record lookups and a tiny bit of work for each lookup.
>> 340 jobs per second are requested, but the system can only handle an average of
>> about 231 jobs per second. Jobs keep being launched until the maximum number
>> of processes of 1000 is hit, at which time the jobs rate is throttled back.
>> The requested test execution time is extended such that all jobs are completed.
>>
>> The test measures latency between the launch of the job and completion.
>> The issue is that some jobs seem to get forgotten about for a long long time.
>>
>> Processor: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz, 6 cores 12 CPUs.
>> HWP: Enabled. Test Disk: HDD Seagate IronWolf Pro 16TB NAS
>> OS: Ubuntu 24.04.4 LTS (server, no GUI). Other than the test, the system is very idle.
>>
>> Kernels used for this email: bad:
>> As of commit 988bb1b9eded called "kuai"
>> With 988bb1b9eded reverted, called "kuai-revert"
>> I.E.:
>>
>> 2d7cf26b0d4b (HEAD -> kuai) Revert "mq-deadline: covert to use request_queue->async_depth"
>> 988bb1b9eded mq-deadline: covert to use request_queue->async_depth
>> 8cbe62f4d8c3 kyber: covert to use request_queue->async_depth
>> f98afe4f31bb blk-mq: add a new queue sysfs attribute async_depth
>>
>> Results:
>>
>> kuai:
>> 340.0 Requested jobs per second.
>> 230.9 Actual jobs per second.
>> 110.4 Execution time (Seconds)
>> 13.8 job latency, 90th percentile (Seconds), 85% regression
>> 21.9 job latency, 95th percentile, 144% regression
>> 40.7 job latency, 99th percentile, 235% regression
>> 99.2 maximum job latency, 313% regression
>>
>> kuai-revert (reference):
>> 340.0 Requested jobs per second.
>> 231.8 Actual jobs per second.
>> 110.0 Execution time (Seconds)
>> 7.47 job latency, 90th percentile (Seconds)
>> 8.96 job latency, 95th percentile
>> 12.14 job latency, 99th percentile
>> 24.01 maximum job latency
>>
>> I have yet to investigate the first regression,
>> but just for readers' potential interest,
>> results for a kernel before 6.19-rc1:
>> 340.0 Requested jobs per second.
>> 233.2 Actual jobs per second.
>> 109.4 Execution time (Seconds).
>> 5.7 job latency, 90th percentile (Seconds)
>> 6.32 job latency, 95th percentile
>> 7.37 job latency, 99th percentile
>> 12.60 maximum job latency
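
The regression percentages quoted for "kuai" can be reproduced from the two
result sets, assuming they are the relative increase of each latency over the
"kuai-revert" reference. That formula is an assumption, although it matches the
quoted figures. The names regression_pct, kuai and kuai_revert are illustrative
only.

```python
def regression_pct(bad: float, reference: float) -> float:
    """Relative increase of `bad` over `reference`, in percent."""
    return (bad / reference - 1.0) * 100.0


# Job latencies in seconds, copied from the results quoted above.
kuai = {"p90": 13.8, "p95": 21.9, "p99": 40.7, "max": 99.2}
kuai_revert = {"p90": 7.47, "p95": 8.96, "p99": 12.14, "max": 24.01}

# Rounded to whole percent, as in the quoted results.
regressions = {
    key: round(regression_pct(kuai[key], kuai_revert[key])) for key in kuai
}

for key, pct in regressions.items():
    print(f"{key}: {pct}% regression")
```

With the quoted latencies this gives 85%, 144%, 235% and 313% for the 90th,
95th and 99th percentiles and the maximum, respectively.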

The first regression is due to commit 083654ded547:
"cpuidle: governors: teo: Rework the handling of tick wakeups"
However, I haven't written it up yet.
On the graphs those kernels are called "teo" and "teo-revert"
and are not relevant to this thread.

Attachment: 90th.png
Description: PNG image

Attachment: 95th.png
Description: PNG image

Attachment: 99th.png
Description: PNG image

Attachment: max.png
Description: PNG image