Re: EEVDF regression still exists

From: Prundeanu, Cristian
Date: Thu May 01 2025 - 12:16:22 EST


Hi Prateek,

On 2025-04-29, 22:33, "K Prateek Nayak" <kprateek.nayak@xxxxxxx> wrote:

>>>> Here are the latest results for the EEVDF impact on database workloads.
>>>> The regression introduced in kernel 6.6 still persists and doesn't look
>>>> like it is improving.
>>>
>>> Well, I was under the impression it had actually been solved :-(
>>>
>>> My understanding from the last round was that Prateek and co had it
>>> sorted -- with the caveat being that you had to stick SCHED_BATCH in at
>>> the right place in MySQL start scripts or somesuch.
>>
>> The statement in the previous thread [1] was that using SCHED_BATCH improves
>> performance over default. While that still holds true, it is also equally true
>> about using SCHED_BATCH on kernel 6.5.
>>
>> So, when we compare 6.5 with recent kernels, both using SCHED_BATCH, the
>> regression is still visible. (Previously, we only compared SCHED_BATCH with
>> 6.5 default, leading to the wrong conclusion that it's a fix).
>
> P.S. Are the numbers for v6.15-rc4 + SCHED_BATCH comparable to v6.5
> default?

SCHED_BATCH does improve the performance both on 6.5 and on 6.12+; in my
testing, 6.12-SCHED_BATCH does not quite reach the 6.5-default (without
SCHED_BATCH) performance. Best case (6.15-rc3-SCHED_BATCH) is -3.6%, and
worst case (6.15-rc4-SCHED_BATCH) is -7.0% when compared to 6.5.13-default.

(Please keep in mind that the target isn't to get SCHED_BATCH to the same
level as 6.5-default; it's to resolve the regression from 6.5-default to
6.6+ default, and from 6.5-SCHED_BATCH to 6.6+ SCHED_BATCH).
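
For anyone reproducing the SCHED_BATCH runs: the policy can be applied
without modifying the application itself. The snippet below is only a
minimal sketch, not how the repro scripts do it. The helper name set_batch
is made up for this example, and it assumes Linux, where os.SCHED_BATCH and
os.sched_setscheduler() are available.

```python
import os


def set_batch(pid: int = 0) -> int:
    """Switch `pid` (0 = calling process) to SCHED_BATCH; return the new policy.

    Hypothetical helper for illustration; Linux only.
    """
    # SCHED_BATCH is a non-real-time policy, so the static priority must be 0.
    os.sched_setscheduler(pid, os.SCHED_BATCH, os.sched_param(0))
    return os.sched_getscheduler(pid)
```

Processes forked after the call inherit the policy, so calling set_batch()
in a launcher before starting the server should cover its threads as well.
"chrt --batch 0 <command>" from util-linux has the same effect.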

> One more curious question: Does changing the base slice to a larger
> value (say 6ms) in conjunction with setting SCHED_BATCH on v6.15-rc4
> affect the benchmark result in any way?

I reran 6.15-rc4, with both 3ms (default) and 6ms. The larger base slice
slightly improves performance, more for SCHED_BATCH than for default.
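
For reference, a sketch of how the base slice could be switched between
runs. It assumes the knob is exposed as base_slice_ns under debugfs (mounted
at /sys/kernel/debug, root required); the helper names are made up for this
example, and the path is a parameter so it can be pointed elsewhere if the
assumption does not hold on a given system.

```python
from pathlib import Path

# Assumed location of the base slice knob (debugfs, writable by root only).
BASE_SLICE_PATH = Path("/sys/kernel/debug/sched/base_slice_ns")


def ms_to_ns(ms: float) -> int:
    """Convert milliseconds to the integer nanosecond value the knob expects."""
    return int(round(ms * 1_000_000))


def set_base_slice(ms: float, path: Path = BASE_SLICE_PATH) -> int:
    """Write the base slice (given in ms) to `path`; return the value read back."""
    path.write_text(f"{ms_to_ns(ms)}\n")
    return int(path.read_text().strip())
```

Under these assumptions, set_base_slice(6) selects the 6ms setting and
set_base_slice(3) restores the 3ms default used here.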

6ms compared to 3ms on the same kernel (not compared to 6.5):

Kernel               | Throughput | Latency
---------------------+------------+---------
6.15-rc4 default     | +1.1%      | -1.3%
6.15-rc4 SCHED_BATCH | +2.9%      | -2.7%
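
The percentages in the table are relative changes of the 6ms run against the
3ms run on the same kernel. As a sketch of the arithmetic (the function name
and the sample numbers are made up for illustration and are not taken from
the results):

```python
def relative_change_pct(baseline: float, value: float) -> float:
    """Percent change of `value` relative to `baseline`, rounded to 0.1%."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return round((value - baseline) / baseline * 100.0, 1)


# Made-up numbers: going from 1000 to 1029 is a +2.9% change.
print(f"{relative_change_pct(1000.0, 1029.0):+.1f}%")
```

For throughput a positive value is an improvement; for latency a negative
value is.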

Full details, reports and data:
https://github.com/aws/repro-collection/blob/main/repros/repro-mysql-EEVDF-regression/results/20250430/README.md
(These perf files all have the same schedstat version, so hopefully "perf
sched stats diff" works better this time.)

-Cristian