Re: EEVDF regression still exists

From: K Prateek Nayak
Date: Fri May 02 2025 - 01:56:21 EST


Hello Cristian,

On 5/1/2025 9:46 PM, Prundeanu, Cristian wrote:
Hi Prateek,

On 2025-04-29, 22:33, "K Prateek Nayak" <kprateek.nayak@xxxxxxx> wrote:

Here are the latest results for the EEVDF impact on database workloads.
The regression introduced in kernel 6.6 still persists and doesn't look
like it is improving.

Well, I was under the impression it had actually been solved :-(

My understanding from the last round was that Prateek and co had it
sorted -- with the caveat being that you had to stick SCHED_BATCH in at
the right place in MySQL start scripts or somesuch.
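
For readers unfamiliar with the mechanism, here is a minimal sketch of
switching a process to SCHED_BATCH before it execs the real workload. It
assumes Python 3 on Linux. The helper names set_batch and exec_as_batch
are made up for illustration; this is not how the MySQL start scripts or
the repro in this thread actually do it.

```python
import os


def set_batch(pid: int = 0) -> int:
    """Switch `pid` (0 means the calling process) to SCHED_BATCH.

    Returns the scheduling policy reported by the kernel afterwards.
    """
    # SCHED_BATCH is a non-realtime policy, so the static priority must be 0.
    os.sched_setscheduler(pid, os.SCHED_BATCH, os.sched_param(0))
    return os.sched_getscheduler(pid)


def exec_as_batch(argv: list[str]) -> None:
    """Hypothetical launcher: set SCHED_BATCH, then replace this process.

    The policy is kept across execve(), and children and threads created
    later inherit it unless the reset-on-fork flag is set.
    """
    set_batch()
    os.execvp(argv[0], argv)
```

Something like exec_as_batch(["mysqld"]) in a wrapper script would have
a similar effect to "chrt --batch 0 mysqld" from util-linux.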

The statement in the previous thread [1] was that using SCHED_BATCH improves
performance over default. While that still holds true, it is equally true
of using SCHED_BATCH on kernel 6.5.

So, when we compare 6.5 with recent kernels, both using SCHED_BATCH, the
regression is still visible. (Previously, we only compared SCHED_BATCH with
6.5 default, leading to the wrong conclusion that it's a fix).

P.S. Are the numbers for v6.15-rc4 + SCHED_BATCH comparable to v6.5
default?

SCHED_BATCH does improve the performance both on 6.5 and on 6.12+; in my
testing, 6.12-SCHED_BATCH does not quite reach the 6.5-default (without
SCHED_BATCH) performance. Best case (6.15-rc3-SCHED_BATCH) is -3.6%, and
worst case (6.15-rc4-SCHED_BATCH) is -7.0% when compared to 6.5.13-default.

(Please keep in mind that the target isn't to get SCHED_BATCH to the same
level as 6.5-default; it's to resolve the regression from 6.5-default to
6.6+ default, and from 6.5-SCHED_BATCH to 6.6+ SCHED_BATCH).

Ack! I was just curious if all of the performance drop can be
attributed to aggressive wakeup preemption or not.


One more curious question: Does changing the base slice to a larger
value (say 6ms) in conjunction with setting SCHED_BATCH on v6.15-rc4
affect the benchmark result in any way?

I reran 6.15-rc4, with both 3ms (default) and 6ms. The larger base slice
slightly improves performance, more for SCHED_BATCH than for default.

6ms compared to 3ms same kernel (not compared to 6.5):

Kernel               | Throughput | Latency
---------------------+------------+---------
6.15-rc4 default     | +1.1%      | -1.3%
6.15-rc4 SCHED_BATCH | +2.9%      | -2.7%
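
The thread does not say how the base slice was changed for these runs.
One common way, assumed here, is the debugfs file
/sys/kernel/debug/sched/base_slice_ns on 6.6+ kernels (root and a
mounted debugfs required). A minimal sketch, with hypothetical helper
names, of reading the value and switching between 3ms and 6ms:

```python
from pathlib import Path

# Assumed knob location; not stated anywhere in this thread.
BASE_SLICE_PATH = Path("/sys/kernel/debug/sched/base_slice_ns")


def read_base_slice_ms(path: Path = BASE_SLICE_PATH) -> float:
    """Return the current base slice in milliseconds."""
    # The file holds a single integer number of nanoseconds.
    return int(path.read_text().strip()) / 1e6


def write_base_slice_ms(ms: float, path: Path = BASE_SLICE_PATH) -> None:
    """Set the base slice, given in milliseconds (needs root on a real system)."""
    path.write_text(f"{round(ms * 1_000_000)}\n")
```

On a test box this would be used as write_base_slice_ms(6) before a run
and write_base_slice_ms(3) to go back to the value reported as default
above. The setting does not survive a reboot.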

Full details, reports and data:
https://github.com/aws/repro-collection/blob/main/repros/repro-mysql-EEVDF-regression/results/20250430/README.md
(These perf files all have the same schedstat version, hopefully "perf
sched stats diff" worked better this time).

Thank you for the information. Ravi and Swapnil are working to
get perf sched stats diff to behave well when comparing different
versions. It should be fixed in subsequent versions.

P.S. I'm still setting up the system and have got my SUT pretty
close to what you have described. I couldn't quite reproduce the
regression on baremetal with my previous configuration on v6.15-rc4.

Could you also provide some information on your LDG machine - its
configuration and the kernel it is running (although this shouldn't
really matter as long as it is the same across runs)?


-Cristian


--
Thanks and Regards,
Prateek