Re: [PATCH 31/31] sched_ext: Add a rust userspace hybrid example scheduler

From: Peter Zijlstra
Date: Tue Dec 13 2022 - 06:37:51 EST


On Mon, Dec 12, 2022 at 02:18:59PM -0800, Josh Don wrote:

> > But really, having seen some of this I long for the UMCG patches -- that
> > at least was somewhat sane and trivially composes, unlike all this
> > madness.
>
> I wasn't sure if you were focusing specifically on how the BPF portion
> is implemented, or on UMCG vs sched_ext. For the latter,

The latter. From where I'm sitting, UMCG looks a *TON* saner than this
BPF scheduler proposal. In fact, I'm >< close to just saying NAK to the
whole thing and ignoring it henceforth; there are too many problems with
the whole approach.

( Many were already noted by Linus when he NAK'ed loadable schedulers
previously. )

> and ignoring
> the specifics of this example, the UMCG and sched_ext work are
> complementary, but not mutually exclusive. UMCG is about driving
> cooperative scheduling within a particular application. UMCG does not
> have control over or react to external preemption,

It can control preemption inside the process, and if you have the degree
of control you need to make the whole BPF thing work, you also have the
degree of control to ensure you only run the one server task on a CPU.
At that point none of this matters, because there's only the one process
and you control preemption inside it.

> nor does it make thread placement decisions.

It can do that just fine -- inside the process. UMCG has full control
over which server task a worker task is associated with; run a single
server task per CPU, pin each one, and you get full placement control.
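
To make the "one pinned server task per CPU" half of that concrete,
here is a minimal sketch. It assumes Linux and uses only the stock
affinity calls in Python's os module; it does not use the interface
from the UMCG patches, and it does not model worker tasks at all. The
names run_pinned_servers and work are hypothetical, chosen for
illustration only.

```python
import os
import threading


def run_pinned_servers(work):
    """Start one 'server' thread per allowed CPU and pin each to its CPU.

    `work` is a hypothetical callable taking the CPU number; a real server
    would pick and run worker tasks here instead.
    Returns {cpu: (affinity mask seen by that server, work result)}.
    """
    # CPUs the calling thread is currently allowed to run on.
    cpus = sorted(os.sched_getaffinity(0))
    results = {}

    def server(cpu):
        # On Linux, pid 0 means "the calling thread", so this pins only
        # this server thread and leaves the other threads alone.
        os.sched_setaffinity(0, {cpu})
        results[cpu] = (os.sched_getaffinity(0), work(cpu))

    threads = [threading.Thread(target=server, args=(cpu,)) for cpu in cpus]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling run_pinned_servers(lambda cpu: cpu) should give every server
thread an affinity mask containing exactly its own CPU. Which worker
runs on which CPU would then be decided entirely by which server it is
handed to, which is the placement control described above.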

> sched_ext is considering things more at
> the system level: arbitrating fairness and preemption between
> processes, deciding when and where threads run, etc., and also being
> able to take application-specific hints if desired.

sched_ext fundamentally does not compose: you cannot run two different
schedulers for two different application stacks that happen to co-reside
on the same machine.

With UMCG that comes naturally.

sched_ext also sits at the very bottom of the class stack (it more or
less has to). The result is that in order to use it at all, you have to
have control over all runnable tasks in the system (a stray CFS task
would interfere quite disastrously), but that is exactly the same
constraint you need to make UMCG work.

Conversely, it is very hard to use the BPF thing to do what UMCG can do.
Using UMCG I can have a SCHED_DEADLINE server implement a task-based
pipeline schedule (something that's fairly common and really hard to
pull off with just SCHED_DEADLINE itself).

Additionally, UMCG naturally works with things like Proxy Execution,
seeing how the server task *is* a proxy for the current active worker
task.