Re: [PATCH v2] sched/fair: sanitize vruntime of entity being migrated

From: Peter Zijlstra
Date: Tue Mar 14 2023 - 13:16:49 EST


On Tue, Mar 14, 2023 at 02:24:37PM +0100, Vincent Guittot wrote:

> > @@ -7632,11 +7646,8 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
> > * min_vruntime -- the latter is done by enqueue_entity() when placing
> > * the task on the new runqueue.
> > */
> > - if (READ_ONCE(p->__state) == TASK_WAKING) {
> > - struct cfs_rq *cfs_rq = cfs_rq_of(se);
> > -
> > + if (READ_ONCE(p->__state) == TASK_WAKING || reset_vruntime(cfs_rq, se))
>
> That's somehow what was proposed in one of the previous proposals but
> we can't call rq_clock_task(rq_of(cfs_rq)) because rq lock might not
> be hold and rq task clock has not been updated before being used

Argh indeed. I spend a lot of time ensuring we didn't take the old rq
lock on wakeup -- and then a lot of time cursing about how we don't :-)

Now, if we could rely on the rq-clock being no more than 1 tick behind
current, this would still be entirely sufficient to catch the long sleep
case.

Except I suppose that NOHZ can bite us here. If the old CPU is idle, the
timestamps can be arbitrarily old. Mooo :/