[RFC][PATCH 00/10] sched/fair: Complete EEVDF
From: Peter Zijlstra
Date: Fri Apr 05 2024 - 07:04:12 EST
Hi all,
I'm slowly crawling back out of the hole and trying to get back to work.
Availability is still low on my end, but I'll try and respond to some email.
Anyway, in order to get my feet wet again with sitting behind a computer, find
here a few patches that should functionally complete the EEVDF journey.
This very much includes the new interface that exposes the extra parameter that
EEVDF has. I've chosen to use sched_attr::sched_runtime for this over a
nice-like value because some workloads actually know their slice length (it can
be measured dynamically, in the same way as for SCHED_DEADLINE, using
CLOCK_THREAD_CPUTIME_ID), and using the real request size is much more effective
than some relative measure.
[[ using too short a request size will increase job preemption overhead,
using too long a request size will decrease timeliness ]]
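
For illustration only, a rough userspace sketch of what using this could look
like: measure the CPU time of one job with CLOCK_THREAD_CPUTIME_ID and hand it
to the kernel as sched_attr::sched_runtime. The helper names
(thread_cputime_ns, request_slice) and the local struct slice_sched_attr are
made up for this example; the struct is assumed to mirror the 48-byte VER0
layout of the uapi struct sched_attr, and raw syscalls are used because not
every libc ships a sched_setattr() wrapper. Whether sched_runtime has any
effect on a fair task depends on running a kernel with these patches applied.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical local mirror of the uapi struct sched_attr (VER0, 48 bytes). */
struct slice_sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;		/* the request size, in nanoseconds */
	uint64_t sched_deadline;
	uint64_t sched_period;
};

/* CPU time consumed so far by the calling thread, in ns (0 on failure). */
uint64_t thread_cputime_ns(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts))
		return 0;
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Ask for a slice of slice_ns for the calling thread. The current
 * attributes are read back first so that policy and nice are preserved;
 * only sched_runtime is changed. Returns 0 on success, -1 with errno set.
 */
int request_slice(uint64_t slice_ns)
{
	struct slice_sched_attr attr;

	memset(&attr, 0, sizeof(attr));
	if (syscall(SYS_sched_getattr, 0, &attr, (unsigned int)sizeof(attr), 0))
		return -1;

	attr.size = sizeof(attr);
	attr.sched_runtime = slice_ns;
	return (int)syscall(SYS_sched_setattr, 0, &attr, 0);
}
```

A workload would sample thread_cputime_ns() before and after one iteration of
its job and pass the difference (perhaps smoothed over a few iterations) to
request_slice().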
The whole delayed-dequeue thing is, I think, a fundamental thing that was missing
from the EEVDF paper. Without something like this EEVDF will simply not work
right. IIRC this was mentioned to me many years ago when people worked on the BFQ
iosched and ran into this same issue. Time had erased the critical aspect of
this note and I had to rediscover it.
Also, I think Ben expressed concern a while back that preserving lag over long
periods doesn't make sense.
The implementation presented here is one that should work with our cgroup mess
and keeps most of the ugly inside fair.c unlike previous versions which puked
all over the core scheduler code.
Critically, cfs-cgroup throttling is not tested, and cgroups are only tested
insofar as a systemd-infected machine now boots (took a bit).
Other than that, it works well enough to build the next kernel and it passes
the few trivial latency-slice tests I ran.
Anyway, please have a poke and let me know...