[PATCH v2] sched: fix potential use-after-free with cfs bandwidth
From: Josh Don
Date: Thu Feb 20 2025 - 20:23:47 EST
We remove the cfs_rq throttled_csd_list entry *before* doing the
unthrottle. The problem is that destroy_cfs_bandwidth() does a
lockless scan of the system for any non-empty CSD lists. As a result,
destroy_cfs_bandwidth() can return while a cfs_rq from the task group
is still waiting to be unthrottled.
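Roughly, the window looks like this (an illustrative interleaving,
not verbatim code):

  CPU A (__cfsb_csd_unthrottle)         CPU B (destroy_cfs_bandwidth)
  -----------------------------         -----------------------------
  list_del_init(&cursor->throttled_csd_list);
                                        lockless scan finds every
                                        rq->cfsb_csd_list empty, so
                                        teardown proceeds and the tg
                                        can be freed
  unthrottle_cfs_rq(cursor);            /* potential use-after-free */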
For full correctness, we should avoid removal from the list until after
we're done unthrottling in __cfsb_csd_unthrottle().
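This closes the window because destroy_cfs_bandwidth()'s scan treats
an empty per-rq CSD list as "no async unthrottle work in flight". A
sketch of that scan, paraphrased from the async unthrottle code added
by the Fixes commit (not the exact upstream loop):

	for_each_possible_cpu(i) {
		struct rq *rq = cpu_rq(i);
		unsigned long flags;

		if (list_empty(&rq->cfsb_csd_list))
			continue;

		/* flush any pending async unthrottle work inline */
		local_irq_save(flags);
		__cfsb_csd_unthrottle(rq);
		local_irq_restore(flags);
	}

With the entry kept on the list until unthrottle_cfs_rq() completes,
a concurrent teardown either observes a non-empty list and flushes it
inline (serializing on the rq lock), or the unthrottle has already
finished.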
For consistency, we make the same change to distribute_cfs_runtime(),
even though this should already be safe due to destroy_cfs_bandwidth()
cancelling the bandwidth hrtimers.
Fixes: 8ad075c2eb1f ("sched: Async unthrottling for cfs bandwidth")
Signed-off-by: Josh Don <joshdon@xxxxxxxxxx>
Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>
Reviewed-by: Chengming Zhou <chengming.zhou@xxxxxxxxx>
---
v2: updated commit message with additional metadata
kernel/sched/fair.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 34fe6e9490c2..78f542ab03cf 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5917,10 +5917,10 @@ static void __cfsb_csd_unthrottle(void *arg)
 
 	list_for_each_entry_safe(cursor, tmp, &rq->cfsb_csd_list,
 				 throttled_csd_list) {
-		list_del_init(&cursor->throttled_csd_list);
-
 		if (cfs_rq_throttled(cursor))
 			unthrottle_cfs_rq(cursor);
+
+		list_del_init(&cursor->throttled_csd_list);
 	}
 
 	rcu_read_unlock();
@@ -6034,11 +6034,11 @@ static bool distribute_cfs_runtime(struct cfs_bandwidth *cfs_b)
 
 		rq_lock_irqsave(rq, &rf);
-		list_del_init(&cfs_rq->throttled_csd_list);
-
 		if (cfs_rq_throttled(cfs_rq))
 			unthrottle_cfs_rq(cfs_rq);
 
+		list_del_init(&cfs_rq->throttled_csd_list);
+
 		rq_unlock_irqrestore(rq, &rf);
 	}
 
 	SCHED_WARN_ON(!list_empty(&local_unthrottle));
--
2.48.1.658.g4767266eb4-goog