Re: [PATCH v3 17/20] sched/core: Introduce default arch handling code for inc/dec preferred CPUs

From: Shrikanth Hegde

Date: Tue Jun 09 2026 - 02:45:34 EST




On 6/9/26 12:10 AM, Ilya Leoshkevich wrote:


On 5/14/26 17:22, Shrikanth Hegde wrote:
Define default handlers for high/low steal time. An arch with better
decision logic may override the default implementation.

- If the steal time is higher than the threshold, reduce the number of
   preferred CPUs by 1 core. The last core in the intersection of online
   and preferred CPUs will be marked as non-preferred.
   Always ensure at least one core is left as preferred.

- If the steal time is lower than the threshold, increase the number of
   preferred CPUs by 1 core. The first online core which is not in
   cpu_preferred_mask will be marked as preferred.
   If all cores are already set to preferred, bail out.

Increase/Decrease may need to modify the splicing across NUMA nodes. It is
being kept simple for now.

Signed-off-by: Shrikanth Hegde <sshegde@xxxxxxxxxxxxx>
---
  include/linux/sched.h |  2 ++
  kernel/sched/core.c   | 58 +++++++++++++++++++++++++++++++++++++++++++
  2 files changed, 60 insertions(+)

[...]

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a3f65e9c7d30..195e3648b1b5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -11368,6 +11368,64 @@ void sched_init_steal_monitor(void)
      steal_mon.sampling_period_ms  = 1000;        /* once per second */
  }
+/*
+ * Default implementation of decrementing the preferred CPUs based on steal
+ * time. The logic is simple: decrease the preferred CPUs by 1 core by
+ * taking out the last core in the online & preferred mask.
+ *
+ * Ensure at least one housekeeping core is always kept as preferred.
+ *
+ * Can be overridden by arch-specific handling.
+ */
+#ifndef arch_dec_preferred_cpus
+void arch_dec_preferred_cpus(struct steal_monitor_t *sm, u64 steal_ratio)
+{
+    int last_cpu, tmp_cpu;
+    int this_cpu = raw_smp_processor_id();
+
+    cpumask_and(sm->tmp_mask, cpu_online_mask, cpu_preferred_mask);

Since preferred is always a subset of online, do we even need this?
Dropping tmp_mask will also make the error handling concern I posted
earlier go away.


You are right. Decrement doesn't need this. Thanks for catching this.

I ran through the cases mentioned in the thread below, plus a few more, and
I don't see a good reason to keep tmp_mask.
https://lore.kernel.org/all/0d8412de-e18a-476f-9eb6-9a977f4474a3@xxxxxxxxxxxxx/
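
To illustrate why the decrement path can work on the preferred mask directly, here is a rough user-space sketch, not the kernel code. Everything in it is an assumption made for illustration: the 64-bit mask_t stand-in for a cpumask, the helper names (core_mask, dec_preferred, inc_preferred), and the contiguous numbering of SMT siblings. It relies on preferred being a subset of online, and it does not model the housekeeping restriction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t mask_t;

/*
 * All SMT siblings of the core containing @cpu.
 * Assumption: siblings are numbered contiguously.
 */
static mask_t core_mask(int cpu, int threads_per_core)
{
    assert(threads_per_core > 0 && threads_per_core < 64);
    int first = cpu - (cpu % threads_per_core);
    return ((1ULL << threads_per_core) - 1) << first;
}

/* Highest set bit in @m, or -1 if @m is empty. */
static int last_cpu(mask_t m)
{
    int cpu = -1;
    for (int i = 0; i < 64; i++)
        if (m & (1ULL << i))
            cpu = i;
    return cpu;
}

/* Lowest set bit in @m, or -1 if @m is empty. */
static int first_cpu(mask_t m)
{
    for (int i = 0; i < 64; i++)
        if (m & (1ULL << i))
            return i;
    return -1;
}

/*
 * High steal: drop the last preferred core, but never the only one.
 * Since preferred is a subset of online, no temporary
 * (online & preferred) mask is needed here.
 */
static mask_t dec_preferred(mask_t preferred, int threads_per_core)
{
    int cpu = last_cpu(preferred);
    if (cpu < 0)
        return preferred;

    mask_t core = core_mask(cpu, threads_per_core);
    if ((preferred & ~core) == 0)
        return preferred;       /* keep at least one core preferred */
    return preferred & ~core;
}

/*
 * Low steal: mark the first online, non-preferred core as preferred.
 * Bail out if every online CPU is already preferred.
 */
static mask_t inc_preferred(mask_t online, mask_t preferred,
                            int threads_per_core)
{
    int cpu = first_cpu(online & ~preferred);
    if (cpu < 0)
        return preferred;
    return preferred | (core_mask(cpu, threads_per_core) & online);
}
```

With 8 online CPUs and 2 threads per core, repeated decrements shrink the mask from the top one core at a time and stop at the last remaining core, while an increment adds back the lowest non-preferred core.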