Re: [GIT PULL] sched_ext: Initial pull request for v6.11

From: Tejun Heo
Date: Tue Jul 30 2024 - 21:22:18 EST


Hello,

On Thu, Jul 25, 2024 at 02:19:07AM +0100, Qais Yousef wrote:
> We really shouldn't change how schedutil works. The governor is supposed to
> behave in a certain way, and we need to ensure consistency. I think you should
> look on how you make your scheduler compatible with it. Adding hooks to say
> apply this perf value that I want is a recipe for randomness.

You made the same point in another thread, so let's discuss it there, but
this isn't changing the relationship between schedutil and the sched
classes. schedutil collects utilization signals from the sched classes and
translates them into cpufreq operations. For SCX schedulers, the only way
to get such signals is to ask the BPF scheduler. Nobody else knows. It's
loading a completely new scheduler after all.

> Generally I do have big concerns about sched_ext being loaded causing spurious
> bug report as it changes the behavior of the scheduler and the kernel is not
> trusted when sched_ext scheduler is loaded. Like out-of-tree modules, it should
> cause the kernel to be tainted. Something I asked for few years back when
> Gushchin sent the first proposal
>
> How can we trust bug and regression report when out-of-tree code was loaded
> that intrusively changes the way the kernel behaves? This must be marked as
> a kernel TAINT otherwise we're doomed trying to fix out of tree code.

You raised this in the other thread too, but I don't think taint fits the
bill here. Taints are useful when the impact is persistent, so that we can
know that a later failure may have been caused by something earlier which
might no longer be around. An SCX scheduler is not supposed to leave any
persistent impact on the system. If it's loaded, we can see that in oops
dumps and other places. If it's not, it shouldn't really be a factor.

> And there's another general problem of regression reports due to failure to
> load code due to changes to how the scheduler evolves. We need to continue to
> be able to change our code freely without worrying about breaking out-of-tree
> code. What is the regression rule? We don't want to be limited to be able to
> make in-kernel changes because out-of-tree code will fail now; either to load
> or to run as intended. How is the current code designed to handle failsafe when
> the external scheduler is no longer compatible with existing kernel and *they*
> need to rewrite their code, pretty much the way it goes for out-of-tree modules
> now?

It's the same as other BPF hooks. We don't want to break things willy-nilly,
but we can definitely break backward compatibility if necessary. This has
been discussed to death and I don't think we can add much by relitigating
the case.

Thanks.

--
tejun