Re: [PATCH v2] sched: fix first task of a task group is attached twice

From: Vincent Guittot
Date: Fri May 27 2016 - 13:17:17 EST


On 27 May 2016 at 17:48, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
> On 25/05/16 16:01, Vincent Guittot wrote:
>> The cfs_rq->avg.last_update_time is initialized to 0, with the main effect
>> that the 1st sched_entity that is attached will keep its
>> last_update_time set to 0 and will be attached once again during the
>> enqueue.
>> Initialize cfs_rq->avg.last_update_time to 1 instead.
>>
>> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
>> ---
>>
>> v2:
>> - rq_clock_task(rq_of(cfs_rq)) can't be used because lock is not held
>>
>> kernel/sched/fair.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 218f8e8..3724656 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -8586,6 +8586,14 @@ void init_tg_cfs_entry(struct task_group *tg, struct cfs_rq *cfs_rq,
>> se->depth = parent->depth + 1;
>> }
>>
>> + /*
>> + * Set last_update_time to something different from 0 to make
>> + * sure the 1st sched_entity will not be attached twice: once
>> + * when attaching the task to the group and one more time when
>> + * enqueueing the task.
>> + */
>> + tg->cfs_rq[cpu]->avg.last_update_time = 1;
>> +
>> se->my_q = cfs_rq;
>> /* guarantee group entities always have weight */
>> update_load_set(&se->load, NICE_0_LOAD);
>
> So why not set the last_update_time value for those cfs_rq's when
> we have the lock? E.g. in task_move_group_fair() or attach_task_cfs_rq().

I'm not sure it's worth adding this initialization to functions that
are called often when it is only needed once, for the init.
If you are concerned about the load update of the 1st task that will be
attached, a long time can still elapse between the creation of the
group and the 1st enqueue of a task. This was the case in the test I
did when I found this issue.
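
To make the double attach from the changelog concrete, here is a rough
user-space model. It is only a sketch: the toy_* types and functions are
hypothetical stand-ins, not the code in fair.c, and it assumes that
attach copies the cfs_rq's last_update_time into the entity and that
enqueue treats an entity whose last_update_time is 0 as migrated.

```c
#include <stdint.h>

/* Toy stand-ins for the kernel structures (hypothetical, minimal fields). */
struct toy_avg {
	uint64_t last_update_time;
	unsigned long load_avg;
};

struct toy_cfs_rq { struct toy_avg avg; };
struct toy_se { struct toy_avg avg; };

/* Assumed attach behaviour: sync the timestamp, then add the entity's load. */
static void toy_attach(struct toy_cfs_rq *cfs_rq, struct toy_se *se)
{
	se->avg.last_update_time = cfs_rq->avg.last_update_time;
	cfs_rq->avg.load_avg += se->avg.load_avg;
}

/* Assumed enqueue behaviour: a zero timestamp means "migrated", so attach. */
static void toy_enqueue(struct toy_cfs_rq *cfs_rq, struct toy_se *se,
			uint64_t now)
{
	int migrated = !se->avg.last_update_time;

	cfs_rq->avg.last_update_time = now;
	if (migrated)
		toy_attach(cfs_rq, se);
}

/*
 * Attach the first task to a fresh group cfs_rq, then enqueue it, and
 * return the load the cfs_rq ends up with.
 */
static unsigned long toy_first_task_load(uint64_t initial_last_update_time)
{
	struct toy_cfs_rq cfs_rq = { { initial_last_update_time, 0 } };
	struct toy_se se = { { 0, 1024 } };

	toy_attach(&cfs_rq, &se);	/* task moved into the group */
	toy_enqueue(&cfs_rq, &se, 1000);	/* first enqueue */
	return cfs_rq.avg.load_avg;
}
```

In this model, an initial value of 0 makes the entity's load count twice,
while any non-zero initial value makes it count once.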

Besides this point, I have to send a new version that also sets
load_last_update_time_copy on non-64-bit systems. Fengguang pointed
out the issue to me.
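
A rough user-space model of why the copy matters, as a sketch only. It
assumes that the non-64-bit reader re-reads last_update_time until it
matches load_last_update_time_copy. The toy_* names and the bounded
retry count are hypothetical and only there to keep the example finite.

```c
#include <stdint.h>

/* Toy stand-in for the two fields involved on non-64-bit systems. */
struct toy_cfs_rq32 {
	uint64_t last_update_time;
	uint64_t last_update_time_copy;
};

/*
 * Assumed reader pattern: retry until the value and its copy agree.
 * Returns 1 and stores the value in *out on success, or 0 if no
 * consistent pair was seen within max_tries (an unbounded loop would
 * spin forever in that case).
 */
static int toy_read_last_update_time(const struct toy_cfs_rq32 *cfs_rq,
				     int max_tries, uint64_t *out)
{
	for (int i = 0; i < max_tries; i++) {
		uint64_t copy = cfs_rq->last_update_time_copy;
		uint64_t val = cfs_rq->last_update_time;

		if (val == copy) {
			*out = val;
			return 1;
		}
	}
	return 0;
}
```

With last_update_time set to 1 and the copy left at 0, the model never
gets a consistent read, so both fields have to be initialized together.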

Regards,
Vincent

>
> @@ -8490,12 +8493,20 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
> #ifdef CONFIG_FAIR_GROUP_SCHED
> static void task_move_group_fair(struct task_struct *p)
> {
> +#ifdef CONFIG_SMP
> + struct cfs_rq *cfs_rq = NULL;
> +#endif
> +
> detach_task_cfs_rq(p);
> set_task_rq(p, task_cpu(p));
>
> #ifdef CONFIG_SMP
> /* Tell se's cfs_rq has been changed -- migrated */
> p->se.avg.last_update_time = 0;
> +
> + cfs_rq = cfs_rq_of(&p->se);
> + if (!cfs_rq->avg.last_update_time)
> + cfs_rq->avg.last_update_time = rq_clock_task(rq_of(cfs_rq));
> #endif
>
> or
>
> @@ -8423,6 +8423,9 @@ static void attach_task_cfs_rq(struct task_struct *p)
> se->depth = se->parent ? se->parent->depth + 1 : 0;
> #endif
>
> + if (!cfs_rq->avg.last_update_time)
> + cfs_rq->avg.last_update_time = rq_clock_task(rq_of(cfs_rq));
> +
> /* Synchronize task with its cfs_rq */
> attach_entity_load_avg(cfs_rq, se);