On May 25, 2024, at 19:48, Honglei Wang <jameshongleiwang@xxxxxxx> wrote:
On 2024/5/24 21:40, Chunxin Zang wrote:
I found that some tasks have been running for a long enough time andHi Chunxin,
have become illegal, but they are still not releasing the CPU. This
will increase the scheduling delay of other processes. Therefore, I
tried checking the current process in wakeup_preempt and entity_tick,
and if it is illegal, reschedule that cfs queue.
The modification can reduce the scheduling delay by about 30% when
RUN_TO_PARITY is enabled.
So far, it has been running well in my test environment, and I have
pasted some test results below.
I isolated four cores for testing. I ran Hackbench in the background
and observed the test results of cyclictest.
hackbench -g 4 -l 100000000 &
cyclictest --mlockall -D 5m -q
EEVDF PATCH EEVDF-NO_PARITY PATCH-NO_PARITY
# Min Latencies: 00006 00006 00006 00006
LNICE(-19) # Avg Latencies: 00191 00122 00089 00066
# Max Latencies: 15442 07648 14133 07713
# Min Latencies: 00006 00010 00006 00006
LNICE(0) # Avg Latencies: 00466 00277 00289 00257
# Max Latencies: 38917 32391 32665 17710
# Min Latencies: 00019 00053 00010 00013
LNICE(19) # Avg Latencies: 37151 31045 18293 23035
# Max Latencies: 2688299 7031295 426196 425708
I'm actually a bit hesitant about placing this modification under the
NO_PARITY feature. This is because the modification conflicts with the
semantics of RUN_TO_PARITY. So, I captured and compared the number of
resched occurrences in wakeup_preempt to see if it introduced any
additional overhead.
Similarly, hackbench is used to stress the utilization of four cores to
100%, and the method for capturing the number of PREEMPT occurrences is
referenced from [1].
schedstats EEVDF PATCH EEVDF-NO_PARITY PATCH-NO_PARITY CFS(6.5)
stats.check_preempt_count 5053054 5057286 5003806 5018589 5031908
stats.patch_cause_preempt_count ------- 858044 ------- 765726 -------
stats.need_preempt_count 570520 858684 3380513 3426977 1140821
From the above test results, there is a slight increase in the number of
resched occurrences in wakeup_preempt. However, the results vary with each
test, and sometimes the difference is not that significant. But overall,
the count of reschedules remains lower than that of CFS and is much less
than that of NO_PARITY.
[1]: https://lore.kernel.org/all/20230816134059.GC982867@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#m52057282ceb6203318be1ce9f835363de3bef5cb
Signed-off-by: Chunxin Zang <zangchunxin@xxxxxxxxxxx>
Reviewed-by: Chen Yang <yangchen11@xxxxxxxxxxx>
---
kernel/sched/fair.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 03be0d1330a6..a0005d240db5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5523,6 +5523,9 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
hrtimer_active(&rq_of(cfs_rq)->hrtick_timer))
return;
#endif
+
+ if (!entity_eligible(cfs_rq, curr))
+ resched_curr(rq_of(cfs_rq));
}
@@ -8325,6 +8328,9 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
if (unlikely(p->policy != SCHED_NORMAL) || !sched_feat(WAKEUP_PREEMPTION))
return;
+ if (!entity_eligible(cfs_rq, se))
+ goto preempt;
+
find_matching_se(&se, &pse);
WARN_ON_ONCE(!pse);
Did you run a comparative test to see which modification is more helpful on improve the latency? Modification at tick point makes more sense to me. But, seems just resched arbitrarily in wakeup might introduce too much preemption (and maybe more context switch?) in complex environment such as cgroup hierarchy.
Thanks,
Honglei
Hi Honglei
I attempted to build a slightly more complex scenario. It consists of 4 isolated cores,
4 groups of hackbench (160 processes in total) to stress the CPU, and 1 cyclictest
process to test scheduling latency. Using cgroup v2, to created 64 cgroup leaf nodes
in a binary tree structure (with a depth of 7). I then evenly distributed the aforementioned
161 processes across the 64 cgroups respectively, and observed the scheduling delay
performance of cyclictest.
Unfortunately, the test results were very fluctuating, and the two sets of data were very
close to each other. I suspect that it might be due to too few processes being distributed
in each cgroup, which led to the logic for determining ineligible always succeeding and
following the original logic. Later, I will attempt more tests to verify the impact of these
modifications in scenarios involving multiple cgroups.
thanks
Chunxin