Re: [PATCH v2] sched/fair: sanitize vruntime of entity being migrated

From: Dietmar Eggemann
Date: Tue Mar 14 2023 - 09:34:46 EST


On 14/03/2023 13:07, Peter Zijlstra wrote:
> On Tue, Mar 14, 2023 at 08:41:30AM +0100, Vincent Guittot wrote:
>
>> I'm going to use something a bit different from your proposal below by
>> merging initial and flag:
>>
>> static void place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
>>
>> with flags:
>> 0 for initial placement
>> ENQUEUE_WAKEUP for wakeup
>> ENQUEUE_MIGRATED for migrated task
>
> So when a task is not running for a long time (our case at hand), then
> there are two cases:
>
> - it wakes up locally and place_entity() gets to reset vruntime;
> - it wakes up remotely and migrate_task_rq_fair() can reset vruntime.
>
> So if we can rely on ENQUEUE_MIGRATED to differentiate between these
> cases, then wouldn't something like this work?

I guess so. We would avoid rq_clock_task skews, and we would not be forced
to pass state saying that the migrating se's vruntime is too old.
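
For illustration only, here is a toy userspace model of the two wakeup
paths as I understand the proposal. Everything prefixed toy_/TOY_ is made
up for this sketch, the long_sleep field stands in for whatever check ends
up detecting a stale vruntime, and the arithmetic is simplified (no sleeper
credit, no rq_clock_task). It is not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel flags; values are arbitrary. */
#define TOY_ENQUEUE_WAKEUP   0x01
#define TOY_ENQUEUE_MIGRATED 0x02

struct toy_cfs_rq { uint64_t min_vruntime; };

struct toy_se {
	uint64_t vruntime;
	bool long_sleep;	/* assumption: "slept long enough to be stale" */
};

/* Remote wakeup: vruntime is fixed up while still tied to the old rq. */
static void toy_migrate(struct toy_cfs_rq *old_rq, struct toy_se *se)
{
	if (se->long_sleep)
		se->vruntime = 0;			/* discard the stale value */
	else
		se->vruntime -= old_rq->min_vruntime;	/* make it rq-relative */
}

/* Enqueue on the destination rq; flags tell the two cases apart. */
static void toy_place(struct toy_cfs_rq *rq, struct toy_se *se, int flags)
{
	if (flags & TOY_ENQUEUE_MIGRATED) {
		/* Already handled in toy_migrate(); only re-base here. */
		se->vruntime += rq->min_vruntime;
		return;
	}

	/* Local wakeup: this is the only place that can reset vruntime. */
	if ((flags & TOY_ENQUEUE_WAKEUP) && se->long_sleep)
		se->vruntime = rq->min_vruntime;
}
```

The point of the sketch is only that the reset happens in exactly one
place per path, and that TOY_ENQUEUE_MIGRATED is enough to tell
toy_place() it has nothing left to reset.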

[...]

> @@ -7632,11 +7646,8 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
>  	 * min_vruntime -- the latter is done by enqueue_entity() when placing
>  	 * the task on the new runqueue.
>  	 */
> -	if (READ_ONCE(p->__state) == TASK_WAKING) {
> -		struct cfs_rq *cfs_rq = cfs_rq_of(se);
> -
> +	if (READ_ONCE(p->__state) == TASK_WAKING || reset_vruntime(cfs_rq, se))

Don't you want to call reset_vruntime() specifically on the waking task?
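
To spell out what I mean: because || short-circuits in C, the right-hand
side is only evaluated when the left-hand side is false. A minimal
userspace sketch, with hypothetical toy_ names standing in for the real
helpers (the real reset_vruntime() takes cfs_rq and se and is not
reproduced here):

```c
#include <stdbool.h>

/* Counts how often the stand-in helper actually runs. */
static int toy_reset_calls;

/* Hypothetical stand-in for reset_vruntime(); always reports "reset". */
static bool toy_reset_vruntime(void)
{
	toy_reset_calls++;
	return true;
}

/* Same shape as the condition in the hunk: waking || reset(...) */
static bool toy_should_renormalize(bool waking)
{
	return waking || toy_reset_vruntime();
}
```

With waking == true the helper is never entered, so toy_reset_calls stays
at 0. It only runs for the non-waking case.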

>  		se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
> -	}
>
>  	if (!task_on_rq_migrating(p)) {
>  		remove_entity_load_avg(se);