Re: [PATCH 2/2] sched_ext: Add cpuperf support

From: Hongyan Xia
Date: Tue Jul 02 2024 - 13:12:15 EST


On 02/07/2024 17:37, Tejun Heo wrote:
Hello, Hongyan.

On Tue, Jul 02, 2024 at 11:23:58AM +0100, Hongyan Xia wrote:
What would be really nice is to have cpufreq support in sched_ext but not
force uclamp_enabled. But, I also think there will be people who are happy
with the current uclamp implementation and want to just reuse it. The best
thing is to let the loaded scheduler decide, somehow, which I don't know if
there's an easy way to do this yet.

I don't know much about uclamp but at least from sched_ext side, it's
trivial to add an ops flag for it and because we know that no tasks are on the
ext class before BPF scheduler is loaded, as long as we switch the
uclamp_enabled value while the BPF scheduler is not loaded, the uclamp
buckets should stay balanced. AFAICS, the only core change we need to make
is moving the uclamp_enabled bool outside sched_class so that it can be
changed at runtime. Is that the case or am I missing something?


Pretty much. To clarify what I meant: for ext, it would be fantastic if sched_class->uclamp_enabled were decided at the moment the custom scheduler is loaded, rather than being enabled globally for all ext schedulers, in case the custom scheduler wants to ignore uclamp or has its own uclamp implementation. Ideally, the loaded scheduler would make that choice during ext_ops->init().

However, sched_class->uclamp_enabled is just a normal struct field, so I cannot immediately see a clean way to let the loaded scheduler set it. We might be able to expose a function from the kernel side that writes sched_class->uclamp_enabled during ext_ops->init(), although that looks a bit messy.
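
To make the ops-flag idea concrete, here is a rough userspace sketch in C. It is not kernel code and not a proposed patch. Every name in it (DEMO_OPS_UCLAMP, struct demo_ops, demo_ext_enable() and so on) is made up for illustration, and the real sched_ext load path is more involved. It only models the point discussed above: the uclamp setting lives outside the constant class description and is written once, at load time, while no scheduler (and therefore no task) is on the class.

```c
#include <stdbool.h>

/* Hypothetical ops flag: the loaded scheduler opts in to the stock uclamp. */
#define DEMO_OPS_UCLAMP (1u << 0)

/* Stand-in for the ops struct a BPF scheduler would provide (assumption). */
struct demo_ops {
	unsigned int flags;
	const char *name;
};

/* Models uclamp_enabled moved out of the const class so it can change. */
bool demo_ext_uclamp_enabled;
/* True while a scheduler is loaded, i.e. while tasks may be on the class. */
bool demo_ext_loaded;

/*
 * Load a scheduler. The uclamp setting is only written here, before any
 * task can be on the class, so bucket accounting cannot go out of balance.
 * Returns 0 on success, -1 if a scheduler is already loaded.
 */
int demo_ext_enable(const struct demo_ops *ops)
{
	if (demo_ext_loaded)
		return -1;

	demo_ext_uclamp_enabled = (ops->flags & DEMO_OPS_UCLAMP) != 0;
	demo_ext_loaded = true;
	return 0;
}

/* Unload the scheduler and fall back to uclamp being off for the class. */
void demo_ext_disable(void)
{
	demo_ext_loaded = false;
	demo_ext_uclamp_enabled = false;
}
```

A scheduler that wants the existing uclamp would set DEMO_OPS_UCLAMP in its flags, and one that ignores uclamp or implements its own would leave the flag clear. A second load attempt is rejected while one scheduler is active, so the value never changes under running tasks.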