Re: [RFC PATCH v2 2/7] sched/fair: Handle throttle path for task based throttle

From: K Prateek Nayak
Date: Mon Apr 14 2025 - 11:12:05 EST


Hello Florian,

On 4/14/2025 8:09 PM, Florian Bezdeka wrote:
> On Wed, 2025-04-09 at 20:07 +0800, Aaron Lu wrote:
>> @@ -8888,6 +8884,9 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
>>  		goto idle;
>>  	se = &p->se;
>> +	if (throttled_hierarchy(cfs_rq_of(se)))
>> +		task_throttle_setup_work(p);
>> +
>>  #ifdef CONFIG_FAIR_GROUP_SCHED
>>  	if (prev->sched_class != &fair_sched_class)
>>  		goto simple;

> For testing purposes I would like to backport that to 6.1-stable. The
> situation around pick_next_task_fair() seems to have changed meanwhile:
>
> - it moved out of the CONFIG_SMP guard
> - Completely different implementation
>
> Backporting to 6.12 looks doable, but 6.6 and below looks challenging

v6.6 introduced the EEVDF algorithm, which changes a fair bit of
fair.c, but the bandwidth control bits are mostly the same, and they
all get ripped out in Patch 2 and Patch 3.

> at first glance. Do you have any insights that could help backporting,
> especially for this hunk, but maybe even in general?

For the particular hunk, on v6.5, you can do:

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b3e25be58e2b..2a8d9f19d0db 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8173,6 +8173,11 @@ done: __maybe_unused;
 	update_misfit_status(p, rq);
+#ifdef CONFIG_CFS_BANDWIDTH
+	if (throttled_hierarchy(cfs_rq_of(&p->se)))
+		task_throttle_setup_work(p);
+#endif
+
 	return p;
 idle:
--

That is, add the task work after the "done" label, just before "p" is
returned.
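
To make the intent of that hunk concrete, here is a toy userspace model
in C. It is not kernel code: every toy_* name is a hypothetical stand-in,
and the real throttled_hierarchy() and task_throttle_setup_work() are
not reproduced here. The sketch assumes that a throttled group affects
every task beneath it, and that setting up the work a second time is
harmless.

```c
/* Toy userspace model of "check at pick time, defer the throttle". */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-group runqueue. */
struct toy_cfs_rq {
	struct toy_cfs_rq *parent;	/* NULL at the root */
	bool throttled;			/* this level ran out of quota */
};

/* Hypothetical stand-in for a task. */
struct toy_task {
	struct toy_cfs_rq *cfs_rq;	/* group the task is queued on */
	bool throttle_work_queued;	/* deferred work is pending */
};

/*
 * True if this level or any ancestor is throttled. Walking the parent
 * chain is a simplification made for this sketch.
 */
static bool toy_throttled_hierarchy(const struct toy_cfs_rq *cfs_rq)
{
	for (; cfs_rq; cfs_rq = cfs_rq->parent)
		if (cfs_rq->throttled)
			return true;
	return false;
}

/* Mark deferred work as pending; calling it twice changes nothing. */
static void toy_task_throttle_setup_work(struct toy_task *p)
{
	assert(p != NULL);
	p->throttle_work_queued = true;
}

/* Models the tail of the pick path: the spot just before "return p". */
static struct toy_task *toy_pick_done(struct toy_task *p)
{
	if (toy_throttled_hierarchy(p->cfs_rq))
		toy_task_throttle_setup_work(p);
	return p;
}
```

The point of the placement is that the picked task is still returned
and allowed to run; only the deferred work is armed at pick time.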

For the most part, this should be easily portable since the bandwidth
control mechanism hasn't seen many changes except for the async
throttling and a few bits around throttled time accounting. Also, you
can drop all the bits that refer to "delayed" or "DEQUEUE_DELAYED"
since those are EEVDF-specific (Patch 6 can be fully dropped on
versions < v6.6).


> Best regards,
> Florian

--
Thanks and Regards,
Prateek