[PATCH v2] sched/fair: Reschedule the cfs_rq when current is ineligible

From: Chunxin Zang
Date: Wed May 29 2024 - 10:19:45 EST


I found that some tasks keep running long enough to become ineligible,
yet they still do not release the CPU. This increases the scheduling
latency of other tasks. Therefore, I check the current task in
wakeup_preempt and entity_tick, and if it is ineligible, reschedule
that cfs_rq.
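
For illustration, the eligibility test can be modelled in user space.
This is a minimal sketch, not the kernel implementation: struct toy_entity
and toy_entity_eligible() are hypothetical names, and it assumes the EEVDF
rule that an entity is eligible when its vruntime is no later than the
load-weighted average vruntime of the queue (current entity included).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a scheduling entity. */
struct toy_entity {
	int64_t vruntime;	/* virtual runtime, arbitrary units */
	int64_t weight;		/* load weight, must be > 0 */
};

/*
 * Return true when e->vruntime is at or before the load-weighted average
 * vruntime of the n entities in q[].
 *
 * With base = q[0].vruntime, the test is
 *	sum(w_i * (v_i - base)) >= (v - base) * sum(w_i)
 * which compares against the average without a division.
 * Overflow is ignored here; the values are assumed to be small.
 */
static bool toy_entity_eligible(const struct toy_entity *q, size_t n,
				const struct toy_entity *e)
{
	int64_t base, avg = 0, load = 0;
	size_t i;

	if (n == 0)
		return true;	/* nothing to compare against */

	base = q[0].vruntime;
	for (i = 0; i < n; i++) {
		assert(q[i].weight > 0);
		avg += (q[i].vruntime - base) * q[i].weight;
		load += q[i].weight;
	}

	return avg >= (e->vruntime - base) * load;
}
```

With entities (vruntime, weight) = (100, 1), (200, 1), (300, 2), the
weighted average vruntime is 225, so the first two are eligible and the
third, which has run furthest ahead, is not.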

When RUN_TO_PARITY is enabled, the behavior remains essentially the
same as before the patch. When NO_RUN_TO_PARITY is enabled, some
additional preemptions are introduced, but not many.
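
The two modes can be summarised as a small decision function. This is a
hedged user-space sketch: toy_need_preempt() and its parameters are
hypothetical names. It assumes that, under RUN_TO_PARITY, the comparison
of se->vlag with se->deadline detects whether the deadline has been
updated since the entity was picked, and that otherwise only eligibility
matters.

```c
#include <stdbool.h>

/*
 * Model of the preemption decision for the current entity.
 *
 * run_to_parity:  the RUN_TO_PARITY scheduler feature is enabled
 * deadline_moved: the deadline changed since the entity was picked
 *                 (assumed meaning of se->vlag != se->deadline)
 * eligible:       the entity is still eligible
 */
static bool toy_need_preempt(bool run_to_parity, bool deadline_moved,
			     bool eligible)
{
	if (run_to_parity)
		return deadline_moved;	/* eligibility is not consulted */

	return !eligible;		/* NO_RUN_TO_PARITY: preempt once ineligible */
}
```

In this model an ineligible current entity is preempted immediately only
in the NO_RUN_TO_PARITY case, which is where the extra preemptions come
from.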

I have pasted some test results below.
I isolated four cores for testing, ran hackbench in the background,
and observed the cyclictest results.

hackbench -g 4 -l 100000000 &
cyclictest --mlockall -D 5m -q

                                EEVDF    PATCH  EEVDF-NO_PARITY  PATCH-NO_PARITY

            # Min Latencies:    00006    00006            00006            00006
LNICE(-19)  # Avg Latencies:    00191    00133            00089            00066
            # Max Latencies:    15442    08466            14133            07713

            # Min Latencies:    00006    00010            00006            00006
LNICE(0)    # Avg Latencies:    00466    00326            00289            00257
            # Max Latencies:    38917    13945            32665            17710

            # Min Latencies:    00019    00053            00010            00013
LNICE(19)   # Avg Latencies:    37151    25852            18293            23035
            # Max Latencies:  2688299  4643635           426196           425708

I captured and compared the number of preempt occurrences in wakeup_preempt
to see if it introduced any additional overhead.

Similarly, hackbench is used to stress the utilization of four cores to
100%, and the method for capturing the number of PREEMPT occurrences is
referenced from [1].

schedstats                   EEVDF    PATCH  EEVDF-NO_PARITY  PATCH-NO_PARITY  CFS(6.5)
stats.check_preempt_count  5053054  5045388          5018589          5029585
stats.patch_preempt_count  -------  0020495          -------          0700670   -------
stats.need_preempt_count   0570520  0458947          3380513          3116966   1140821

From the above test results, there is a slight increase in the number of
preempt occurrences in wakeup_preempt. However, the results vary with each
test, and sometimes the difference is not that significant.

[1]: https://lore.kernel.org/all/20230816134059.GC982867@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#m52057282ceb6203318be1ce9f835363de3bef5cb

Signed-off-by: Chunxin Zang <zangchunxin@xxxxxxxxxxx>
Reviewed-by: Chen Yang <yangchen11@xxxxxxxxxxx>

------
Changes in v2:
- Make the logic that determines the current process as ineligible and
  triggers preemption effective only when NO_RUN_TO_PARITY is enabled.
- Update the commit message
---
kernel/sched/fair.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 03be0d1330a6..fa2c512139e5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -745,6 +745,17 @@ int entity_eligible(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	return vruntime_eligible(cfs_rq, se->vruntime);
 }

+static bool check_entity_need_preempt(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+	if (sched_feat(RUN_TO_PARITY) && se->vlag != se->deadline)
+		return true;
+
+	if (!sched_feat(RUN_TO_PARITY) && !entity_eligible(cfs_rq, se))
+		return true;
+
+	return false;
+}
+
 static u64 __update_min_vruntime(struct cfs_rq *cfs_rq, u64 vruntime)
 {
 	u64 min_vruntime = cfs_rq->min_vruntime;
@@ -5523,6 +5534,9 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
 			hrtimer_active(&rq_of(cfs_rq)->hrtick_timer))
 		return;
 #endif
+
+	if (check_entity_need_preempt(cfs_rq, curr))
+		resched_curr(rq_of(cfs_rq));
 }


@@ -8343,6 +8357,9 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
 	cfs_rq = cfs_rq_of(se);
 	update_curr(cfs_rq);

+	if (check_entity_need_preempt(cfs_rq, se))
+		goto preempt;
+
 	/*
 	 * XXX pick_eevdf(cfs_rq) != se ?
 	 */
--
2.34.1