Re: [PATCH RESEND sched_ext/for-6.12] sched_ext: Don't use double locking to migrate tasks across CPUs
From: Tejun Heo
Date: Thu Aug 08 2024 - 14:25:06 EST
Hello,
On Thu, Aug 08, 2024 at 01:19:27PM -0500, David Vernet wrote:
...
> > + deactivate_task(src_rq, p, 0);
> > + set_task_cpu(p, cpu_of(dst_rq));
> > + p->scx.sticky_cpu = cpu_of(dst_rq);
> > +
> > + raw_spin_rq_unlock(src_rq);
> > + raw_spin_rq_lock(dst_rq);
> >
> > /*
> > * We want to pass scx-specific enq_flags but activate_task() will
> > * truncate the upper 32 bit. As we own @rq, we can pass them through
> > * @rq->scx.extra_enq_flags instead.
> > */
> > - WARN_ON_ONCE(rq->scx.extra_enq_flags);
> > - rq->scx.extra_enq_flags = enq_flags;
> > - activate_task(rq, p, 0);
> > - rq->scx.extra_enq_flags = 0;
> > + WARN_ON_ONCE(!cpumask_test_cpu(cpu_of(dst_rq), p->cpus_ptr));
>
> Hmmm, what's to stop someone from issuing a call to
> set_cpus_allowed_ptr() after we drop the src_rq lock above? Before we
> held any relevant rq lock so everything should have been protected, but
> I'm not following how we prevent racing with the cpus_allowed logic in
> core.c here.
deactivate_task(src_rq, p, 0) sets p->on_rq to TASK_ON_RQ_MIGRATING and
task_rq_lock() can't lock p until it's cleared, so set_cpus_allowed_ptr()
is blocked till the migration is complete.
Thanks.
--
tejun