Re: [RFC][PATCH 03/10] sched/fair: Cleanup pick_task_fair() vs throttle

From: Benjamin Segall
Date: Fri Apr 05 2024 - 17:12:08 EST


Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:

> Per 54d27365cae8 ("sched/fair: Prevent throttling in early
> pick_next_task_fair()") the reason check_cfs_rq_runtime() is under the
> 'if (curr)' check is to ensure the (downward) traversal does not
> result in an empty cfs_rq.
>
> But then the pick_task_fair() 'copy' of all this made it restart the
> traversal anyway, so that seems to solve the issue too.

Yeah, putting the check_cfs_rq_runtime inside of that condition was
specific to the exact pnt_fair code, and the specific places that
nr_running was and was not checked. pick_task_fair doesn't care about
any of that, and if instead put_prev manages to throttle the picked
task, we can still successfully switch to it for a moment.

Reviewed-by: Ben Segall <bsegall@xxxxxxxxxx>

>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
> kernel/sched/fair.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8435,11 +8435,11 @@ static struct task_struct *pick_task_fai
> update_curr(cfs_rq);
> else
> curr = NULL;
> -
> - if (unlikely(check_cfs_rq_runtime(cfs_rq)))
> - goto again;
> }
>
> + if (unlikely(check_cfs_rq_runtime(cfs_rq)))
> + goto again;
> +
> se = pick_next_entity(cfs_rq);
> cfs_rq = group_cfs_rq(se);
> } while (cfs_rq);