Re: [PATCH v2] sched/fair: Reschedule the cfs_rq when current is ineligible

From: Chen Yu
Date: Fri Jun 07 2024 - 01:07:56 EST


On 2024-05-29 at 22:18:06 +0800, Chunxin Zang wrote:
> I found that some tasks have been running long enough to become
> ineligible, yet they still do not release the CPU. This increases the
> scheduling delay of other processes. Therefore, I tried checking the
> current task in wakeup_preempt and entity_tick, and if it is
> ineligible, rescheduling that cfs_rq.
>
> When RUN_TO_PARITY is enabled, the behavior essentially remains
> consistent with the original logic. When NO_RUN_TO_PARITY is enabled,
> some additional preemptions are introduced, but not too many.
>
> I have pasted some test results below.
> I isolated four cores for testing, ran hackbench in the background,
> and observed the results of cyclictest.
>
> hackbench -g 4 -l 100000000 &
> cyclictest --mlockall -D 5m -q
>
>                               EEVDF    PATCH    EEVDF-NO_PARITY  PATCH-NO_PARITY
>
>             # Min Latencies:  00006    00006    00006            00006
> LNICE(-19)  # Avg Latencies:  00191    00133    00089            00066
>             # Max Latencies:  15442    08466    14133            07713
>
>             # Min Latencies:  00006    00010    00006            00006
> LNICE(0)    # Avg Latencies:  00466    00326    00289            00257
>             # Max Latencies:  38917    13945    32665            17710
>
>             # Min Latencies:  00019    00053    00010            00013
> LNICE(19)   # Avg Latencies:  37151    25852    18293            23035
>             # Max Latencies:  2688299  4643635  426196           425708
>
> I captured and compared the number of preempt occurrences in wakeup_preempt
> to see if it introduced any additional overhead.
>
> Similarly, hackbench is used to drive the utilization of the four
> cores to 100%, and the method for counting the preempt occurrences is
> taken from [1].
>
> schedstats                   EEVDF    PATCH    EEVDF-NO_PARITY  PATCH-NO_PARITY  CFS(6.5)
> .stats.check_preempt_count   5053054  5045388  5018589          5029585
> .stats.patch_preempt_count   -------  0020495  -------          0700670          -------
> .stats.need_preempt_count    0570520  0458947  3380513          3116966          1140821
>
> From the above test results, there is a slight increase in the number
> of preempt occurrences in wakeup_preempt. However, the results vary
> from run to run, and sometimes the difference is not that significant.
>
> [1]: https://lore.kernel.org/all/20230816134059.GC982867@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#m52057282ceb6203318be1ce9f835363de3bef5cb
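
(Side note: the counters above are not in mainline; they come from a debug
counting approach along the lines of [1]. As a rough illustration only -- not
the exact code from [1], and the rq fields here are additions that do not
exist upstream -- the counting looks roughly like this:

	static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int wake_flags)
	{
		/* one increment per invocation -> .stats.check_preempt_count */
		schedstat_inc(rq->check_preempt_count);
		...
		/* the new path added by this patch -> .stats.patch_preempt_count */
		if (check_entity_need_preempt(cfs_rq, se)) {
			schedstat_inc(rq->patch_preempt_count);
			goto preempt;
		}
		...
	preempt:
		/* any decision to preempt -> .stats.need_preempt_count */
		schedstat_inc(rq->need_preempt_count);
		resched_curr(rq);
	}
)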
>
> Signed-off-by: Chunxin Zang <zangchunxin@xxxxxxxxxxx>
> Reviewed-by: Chen Yang <yangchen11@xxxxxxxxxxx>
>
> ------
> Changes in v2:
> - Make the logic that treats the current task as ineligible and
> triggers preemption take effect only when NO_RUN_TO_PARITY is enabled.
> - Update the commit message
> ---
> kernel/sched/fair.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 03be0d1330a6..fa2c512139e5 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -745,6 +745,17 @@ int entity_eligible(struct cfs_rq *cfs_rq, struct sched_entity *se)
> return vruntime_eligible(cfs_rq, se->vruntime);
> }
>
> +static bool check_entity_need_preempt(struct cfs_rq *cfs_rq, struct sched_entity *se)
> +{
> + if (sched_feat(RUN_TO_PARITY) && se->vlag != se->deadline)
> + return true;

If I understand correctly, the intent here is to check whether the current se
has consumed its 1st slice after being picked in set_next_entity(), and if so,
to do a reschedule. check_entity_need_preempt() is added at the end of
entity_tick(), which could override the existing policy for rescheduling
current (entity_tick()->update_curr()->update_deadline()): there, current is
only preempted when more than 1 task is runnable, even if it has expired its
1st requested slice.
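
For reference, this is roughly the mainline logic I mean (paraphrased from
kernel/sched/fair.c, details may differ between kernel versions):

	/*
	 * set_next_entity() stashes the deadline at pick time in se->vlag, so
	 * "se->vlag == se->deadline" later means the 1st requested slice has
	 * not been consumed yet (this is what RUN_TO_PARITY keys off).
	 *
	 * update_deadline(), reached via update_curr() from entity_tick():
	 */
	static void update_deadline(struct cfs_rq *cfs_rq, struct sched_entity *se)
	{
		/* still inside the current request: nothing to do */
		if ((s64)(se->vruntime - se->deadline) < 0)
			return;

		/* the request has been consumed: grant a new slice ... */
		se->slice = sysctl_sched_base_slice;
		se->deadline = se->vruntime + calc_delta_fair(se->slice, se);

		/* ... and only reschedule if someone else is runnable */
		if (cfs_rq->nr_running > 1) {
			resched_curr(rq_of(cfs_rq));
			clear_buddies(cfs_rq, se);
		}
	}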

> +
> + if (!sched_feat(RUN_TO_PARITY) && !entity_eligible(cfs_rq, se))
> + return true;
> +
> + return false;
> +}
> +
> static u64 __update_min_vruntime(struct cfs_rq *cfs_rq, u64 vruntime)
> {
> u64 min_vruntime = cfs_rq->min_vruntime;
> @@ -5523,6 +5534,9 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
> hrtimer_active(&rq_of(cfs_rq)->hrtick_timer))
> return;
> #endif
> +
> + if (check_entity_need_preempt(cfs_rq, curr))
> + resched_curr(rq_of(cfs_rq));
> }
>
>
> @@ -8343,6 +8357,9 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
> cfs_rq = cfs_rq_of(se);
> update_curr(cfs_rq);
>
> + if (check_entity_need_preempt(cfs_rq, se))
> + goto preempt;
> +

As this changes the preemption policy for current in two places, tick
preemption and wakeup preemption, do you have statistics that show which one
brings the most benefit?
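
For example (hypothetical counters, just to make the question concrete --
tick_preempt_count/wakeup_preempt_count are not existing fields, they would
have to be added to struct rq and exported like the other schedstats):

	/* in entity_tick() */
	if (check_entity_need_preempt(cfs_rq, curr)) {
		schedstat_inc(rq_of(cfs_rq)->tick_preempt_count);
		resched_curr(rq_of(cfs_rq));
	}

	/* in check_preempt_wakeup_fair() */
	if (check_entity_need_preempt(cfs_rq, se)) {
		schedstat_inc(rq->wakeup_preempt_count);
		goto preempt;
	}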

thanks,
Chenyu