Re: [PATCH v6 6/9] sched/fair: fix another detach on unattached task corner case
From: Vincent Guittot
Date: Tue Aug 23 2022 - 03:06:53 EST
On Thu, 18 Aug 2022 at 14:48, Chengming Zhou
<zhouchengming@xxxxxxxxxxxxx> wrote:
>
> commit 7dc603c9028e ("sched/fair: Fix PELT integrity for new tasks")
> fixed two load tracking problems for new tasks, including the detach
> on an unattached new task problem.
>
> There is still another detach on unattached task problem, for a task
> which has been woken up by try_to_wake_up() and is waiting to actually
> be woken up by sched_ttwu_pending().
>
> try_to_wake_up(p)
>   cpu = select_task_rq(p)
>   if (task_cpu(p) != cpu)
>     set_task_cpu(p, cpu)
>       migrate_task_rq_fair()
>         remove_entity_load_avg()       --> unattached
>         se->avg.last_update_time = 0;
>       __set_task_cpu()
>   ttwu_queue(p, cpu)
>     ttwu_queue_wakelist()
>       __ttwu_queue_wakelist()
>
> task_change_group_fair()
>   detach_task_cfs_rq()
>     detach_entity_cfs_rq()
>       detach_entity_load_avg()   --> detach on unattached task
>   set_task_rq()
>   attach_task_cfs_rq()
>     attach_entity_cfs_rq()
>       attach_entity_load_avg()
>
> The cause of this problem is similar: we should check in detach_entity_cfs_rq()
> that se->avg.last_update_time != 0 before calling detach_entity_load_avg().
>
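
For anyone following the thread, the effect of the missing check can be
illustrated with a small user-space model. This is only a sketch under
stated assumptions: the toy_* structures and functions are hypothetical
stand-ins, not kernel code, and the model subtracts load immediately,
ignoring how the kernel defers and clamps the real subtraction. It only
shows why a second subtraction for an entity whose last_update_time is 0
corrupts the runqueue's sum, and how the early return avoids it.

```c
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for a cfs_rq. */
struct toy_cfs_rq {
	long load_avg;              /* sum of the attached entities' load */
};

/* Hypothetical, heavily simplified stand-in for a sched_entity. */
struct toy_se {
	long load_avg;              /* this entity's contribution */
	uint64_t last_update_time;  /* 0 means "not attached" */
};

/* Add the entity's load to the runqueue and mark it attached. */
static void toy_attach(struct toy_cfs_rq *rq, struct toy_se *se, uint64_t now)
{
	rq->load_avg += se->load_avg;
	se->last_update_time = now;
}

/* Models the wakeup-migration path: load removed, entity marked unattached. */
static void toy_remove(struct toy_cfs_rq *rq, struct toy_se *se)
{
	rq->load_avg -= se->load_avg;
	se->last_update_time = 0;
}

/* Models the detach path; 'guarded' selects whether the check is applied. */
static void toy_detach(struct toy_cfs_rq *rq, struct toy_se *se, int guarded)
{
	if (guarded && !se->last_update_time)
		return;                 /* already unattached: nothing to remove */
	rq->load_avg -= se->load_avg;
}

/*
 * Two tasks are attached, task a is removed by a migrating wakeup, and
 * then a cgroup move tries to detach it again before it is enqueued.
 * Returns the runqueue load left behind; only task b (50) should remain.
 */
static long toy_scenario(int guarded)
{
	struct toy_cfs_rq rq = { 0 };
	struct toy_se a = { 100, 0 };
	struct toy_se b = { 50, 0 };

	toy_attach(&rq, &a, 1000);
	toy_attach(&rq, &b, 1000);
	toy_remove(&rq, &a);            /* a is now unattached */
	toy_detach(&rq, &a, guarded);   /* second removal attempt */
	return rq.load_avg;
}
```

In this model toy_scenario(0) removes task a's load twice and leaves the
runqueue below task b's contribution, while toy_scenario(1) leaves
exactly task b's load in place.
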
> Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
Reviewed-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> ---
> kernel/sched/fair.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 1eb3fb3d95c3..eba8a64f905a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11721,6 +11721,17 @@ static void detach_entity_cfs_rq(struct sched_entity *se)
>  {
>  	struct cfs_rq *cfs_rq = cfs_rq_of(se);
>
> +#ifdef CONFIG_SMP
> +	/*
> +	 * In case the task sched_avg hasn't been attached:
> +	 * - A forked task which hasn't been woken up by wake_up_new_task().
> +	 * - A task which has been woken up by try_to_wake_up() but is
> +	 *   waiting for actually being woken up by sched_ttwu_pending().
> +	 */
> +	if (!se->avg.last_update_time)
> +		return;
> +#endif
> +
>  	/* Catch up with the cfs_rq and remove our load when we leave */
>  	update_load_avg(cfs_rq, se, 0);
>  	detach_entity_load_avg(cfs_rq, se);
> --
> 2.37.2
>