[RFC PATCH 7/7] sched/fair: Make sure cfs_rq has enough runtime_remaining on unthrottle path

From: Aaron Lu
Date: Thu Mar 13 2025 - 03:23:58 EST


unthrottle_cfs_rq() can be called with !runtime_remaining, e.g. because
the user changed the quota setting (see tg_set_cfs_bandwidth()), or
because an async unthrottle gave us positive runtime_remaining that other
still-running entities consumed before we got there.

Either way, we can't unthrottle this cfs_rq with no runtime remaining:
a task enqueue during unthrottle can immediately trigger a throttle via
check_enqueue_throttle(), which should never happen on the unthrottle
path.

Signed-off-by: Aaron Lu <ziqianlu@xxxxxxxxxxxxx>
---
kernel/sched/fair.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
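
For illustration only, not part of the patch: a minimal userspace sketch
of the situation described above. All model_* names are hypothetical;
only the field names mirror struct cfs_rq. It assumes the global pool
has nothing to hand out, so the enqueue-time check cannot refill
runtime_remaining.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; only the field names mirror struct cfs_rq. */
struct model_cfs_rq {
	bool runtime_enabled;
	int64_t runtime_remaining;
	bool throttled;
};

/*
 * Models the enqueue-time check: with bandwidth control enabled and no
 * runtime left, the cfs_rq gets throttled. Returns true if it throttled.
 * Assumption: no runtime can be fetched from the global pool.
 */
static bool model_check_enqueue_throttle(struct model_cfs_rq *cfs_rq)
{
	if (!cfs_rq->runtime_enabled || cfs_rq->throttled)
		return false;
	if (cfs_rq->runtime_remaining > 0)
		return false;
	cfs_rq->throttled = true;
	return true;
}

/*
 * Models the unthrottle path. With use_guard set, it bails out early the
 * way the patch does. Returns true if the enqueue re-throttled the cfs_rq.
 */
static bool model_unthrottle(struct model_cfs_rq *cfs_rq, bool use_guard)
{
	if (use_guard && cfs_rq->runtime_enabled && !cfs_rq->runtime_remaining)
		return false;	/* stay throttled, nothing is enqueued */

	cfs_rq->throttled = false;
	/* Stand-in for the task enqueue done while unthrottling. */
	return model_check_enqueue_throttle(cfs_rq);
}
```

In this model, unthrottling with runtime_remaining == 0 and no guard
re-throttles on the first enqueue; with the guard, the cfs_rq simply
stays throttled until it has runtime again.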

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index be96f7d32998c..d646451d617c1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6058,6 +6058,19 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
 	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
 	struct sched_entity *se = cfs_rq->tg->se[cpu_of(rq)];

+	/*
+	 * We can be called with !runtime_remaining, e.g. because the user
+	 * changed the quota setting (see tg_set_cfs_bandwidth()), or because
+	 * an async unthrottle gave us positive runtime_remaining that other
+	 * still-running entities consumed before we got here.
+	 *
+	 * Either way, we can't unthrottle this cfs_rq with no runtime
+	 * remaining: any enqueue below would immediately trigger a throttle,
+	 * which is not supposed to happen on the unthrottle path.
+	 */
+	if (cfs_rq->runtime_enabled && !cfs_rq->runtime_remaining)
+		return;
+
 	cfs_rq->throttled = 0;

 	update_rq_clock(rq);
--
2.39.5