Re: [PATCH 31/31] sched_ext: Add a rust userspace hybrid example scheduler

From: Tejun Heo
Date: Tue Dec 13 2022 - 15:33:52 EST


Hello,

On Tue, Dec 13, 2022 at 12:30:06PM +0100, Peter Zijlstra wrote:
> ( Many were already noted by Linus when he NAK'ed loadable schedulers
> previously. )

Yeah, many of the points Linus raised still stand. However, that was 15
years ago and the situation including hardware reality has changed a lot. As
stated in the cover letter, that makes us (and others) want to try out
various ideas but the barrier has often been too high to do so at any scale,
which BPF drastically improves. Given those, I think it'd be worthwhile to
revisit that discussion.

> sched_ext also sits at the very bottom of the class stack (it more or
> less has to) the result is that in order to use it at all, you have to
> have control over all runnable tasks in the system (a stray CFS task
> would interfere quite disastrously) but that is exactly the same
> constraint you need to make UMCG work.

One important distinction is that it's a lot easier to have control at the
system level than at the application code level. Even for us with pretty
good control over what runs in the fleet, it'd be practically impossible to
effect that level of application change across the board. The situation is
further complicated with containers which can be pretty opaque to the
system. I have a hard time seeing co-operative application-driven scheduling
working among multiple applications across the whole system. If we get to
non-fleet use-cases, it becomes even worse as you don't have enough resources
for or control over the code base you're running.

There may be some overlapping areas between SCX and UMCG but they're very
different things. After all, we can't let go of system level scheduling
because some applications have better control over their own sequencing.

As for the CFS starvation issue, I obviously don't find the currently
proposed behavior too bad - CFS is always the default scheduler and we fall
back to it whenever the BPF scheduling isn't working out, whether that's
outright bugs in the BPF scheduler implementation or starvation through CFS.
That said, this comes down to what kind of behavior we wanna show to
userspace and we can implement whatever is appropriate and acceptable.

Thanks.

--
tejun