[PATCH] sched/fair: Untangle NEXT_BUDDY and pick_next_task()
From: Peter Zijlstra
Date: Fri Nov 29 2024 - 05:16:01 EST
On Fri, Nov 29, 2024 at 10:55:00AM +0100, Peter Zijlstra wrote:
> Anyway.. I'm sure I started a patch series cleaning up the whole next
> buddy thing months ago (there's more problems here), but I can't seem to
> find it in a hurry :/
There was this..
---
Subject: sched/fair: Untangle NEXT_BUDDY and pick_next_task()
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Fri Nov 29 10:36:59 CET 2024
There are 3 sites using set_next_buddy() and only one is conditional
on NEXT_BUDDY, the other two sites are unconditional; to note:
- yield_to_task()
- cgroup dequeue / pick optimization
However, having NEXT_BUDDY control both the wakeup-preemption and the
picking side of things means it's near useless.
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
kernel/sched/fair.c | 4 ++--
kernel/sched/features.h | 9 +++++++++
2 files changed, 11 insertions(+), 2 deletions(-)
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5613,9 +5613,9 @@ static struct sched_entity *
pick_next_entity(struct rq *rq, struct cfs_rq *cfs_rq)
{
/*
- * Enabling NEXT_BUDDY will affect latency but not fairness.
+ * Picking the ->next buddy will affect latency but not fairness.
*/
- if (sched_feat(NEXT_BUDDY) &&
+ if (sched_feat(PICK_BUDDY) &&
cfs_rq->next && entity_eligible(cfs_rq, cfs_rq->next)) {
/* ->next will never be delayed */
SCHED_WARN_ON(cfs_rq->next->sched_delayed);
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -32,6 +32,15 @@ SCHED_FEAT(PREEMPT_SHORT, true)
SCHED_FEAT(NEXT_BUDDY, false)
/*
+ * Allow completely ignoring cfs_rq->next; which can be set from various
+ * places:
+ * - NEXT_BUDDY (wakeup preemption)
+ * - yield_to_task()
+ * - cgroup dequeue / pick
+ */
+SCHED_FEAT(PICK_BUDDY, true)
+
+/*
* Consider buddies to be cache hot, decreases the likeliness of a
* cache buddy being migrated away, increases cache locality.
*/
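For readers who want to see the split in isolation, here is a minimal user-space sketch of the three set_next_buddy() sites and the pick-side gate. It is a toy model, not kernel code: every toy_* name, the toy_feats struct, and the fallback field (standing in for whatever the regular pick would return) are assumptions made for illustration. The assert() plays the role of the SCHED_WARN_ON() in the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the feature bits; not the kernel's sched_feat(). */
struct toy_feats {
	bool next_buddy;	/* models NEXT_BUDDY: wakeup preemption may set ->next */
	bool pick_buddy;	/* models PICK_BUDDY: the pick may honour ->next */
};

struct toy_entity {
	bool eligible;		/* stands in for entity_eligible() */
	bool delayed;		/* stands in for ->sched_delayed */
};

struct toy_cfs_rq {
	struct toy_entity *next;	/* the ->next buddy hint */
	struct toy_entity *fallback;	/* assumed result of the regular pick */
};

/* Unconditional setter, as used by yield_to_task() and cgroup dequeue / pick. */
static void toy_set_next_buddy(struct toy_cfs_rq *rq, struct toy_entity *se)
{
	rq->next = se;
}

/* Wakeup-preemption side: the only caller gated on NEXT_BUDDY. */
static void toy_wakeup_preempt(const struct toy_feats *f,
			       struct toy_cfs_rq *rq, struct toy_entity *wakee)
{
	if (f->next_buddy)
		toy_set_next_buddy(rq, wakee);
}

/* Pick side: gated on PICK_BUDDY, independent of who set ->next. */
static struct toy_entity *toy_pick_next_entity(const struct toy_feats *f,
						struct toy_cfs_rq *rq)
{
	if (f->pick_buddy && rq->next && rq->next->eligible) {
		/* ->next is expected never to be delayed */
		assert(!rq->next->delayed);
		return rq->next;
	}
	return rq->fallback;
}
```

With next_buddy false and pick_buddy true (the defaults in the features.h hunk), a wakeup leaves ->next alone, while a buddy set through the unconditional setter is still picked. Clearing pick_buddy makes the pick ignore ->next regardless of which site set it.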