sched_ext: Partial mode priority and fallthrough to EEVDF

From: Matt Fleming

Date: Tue Mar 10 2026 - 11:25:40 EST


Hi,

At Cloudflare we're experimenting with inverting the priority of the
ext_sched_class and fair_sched_class so that SCHED_EXT tasks are picked
to run before SCHED_NORMAL tasks. This gives us better scheduling
decisions for those SCHED_EXT tasks, since we can embed business logic
into the BPF program, and it prevents them from being starved by the
larger number of SCHED_NORMAL tasks under CPU contention. There are a
couple of reasons we took this route:

1. Our workloads are heterogeneous and complex and we can't move entire
systems to SCHED_EXT in one shot. We want to experiment with running
SCHED_EXT in partial mode as we progressively onboard more and more
services (we run multiple services on single machines).

2. There's no way today (AFAIK) to run in full mode and have BPF
schedulers fall through to EEVDF.
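
For readers less familiar with partial mode: when the BPF scheduler
sets SCX_OPS_SWITCH_PARTIAL, only tasks whose policy is SCHED_EXT are
handed to it, and everything else stays on the fair class. A minimal
userspace sketch of opting a single task in might look like the
following. The helper name opt_in_to_sched_ext() is made up for
illustration, and the fallback #define assumes the UAPI value of 7 for
libc headers that don't yet know about SCHED_EXT.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/* SCHED_EXT is policy 7 in the kernel UAPI headers; older libc
 * headers may not define it, so provide a fallback (assumption). */
#ifndef SCHED_EXT
#define SCHED_EXT 7
#endif

/*
 * Hypothetical helper: switch `pid` (0 means the calling thread) to
 * the SCHED_EXT policy. Returns 0 on success or a negative errno,
 * e.g. -EINVAL on a kernel built without sched_ext.
 */
static int opt_in_to_sched_ext(pid_t pid)
{
	/* Non-realtime policies require a priority of 0. */
	struct sched_param param = { .sched_priority = 0 };

	if (sched_setscheduler(pid, SCHED_EXT, &param) == -1)
		return -errno;
	return 0;
}
```

A service would call opt_in_to_sched_ext(0) early in startup. If no
BPF scheduler is loaded, a SCHED_EXT task is simply treated as
SCHED_NORMAL, so the call is harmless on hosts that haven't been
onboarded yet.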

In an ideal world, the full-mode fallthrough from 2 is what we'd want.
Is anyone else interested
in this problem or currently working on it? Is there anything coming in
the future that would make it easier for those of us slowly
transitioning to SCHED_EXT?

Thanks,
Matt