Re: [PATCH v2 5/6] sched/rt: Better manage pushing unfit tasks on wakeup
From: Qais Yousef
Date: Mon Feb 24 2020 - 12:41:48 EST
On 02/24/20 21:34, Pavan Kondeti wrote:
> Hi Qais,
>
> On Mon, Feb 24, 2020 at 5:42 PM Qais Yousef <qais.yousef@xxxxxxx> wrote:
> [...]
> > We could do, temporarily, to get these fixes into 5.6. But I do think
> > select_task_rq_rt() doesn't do a good enough job into pushing unfit tasks to
> > the right CPUs.
> >
> > I don't understand the reasons behind your objection. It seems you think that
> > select_task_rq_rt() should be enough, but not AFAICS. Can you be a bit more
> > detailed please?
> >
> > FWIW, here's a screenshot of what I see
> >
> > https://imgur.com/a/peV27nE
> >
> > After the first activation, select_task_rq_rt() fails to find the right CPU
> > (due to the same move all tasks to the cpumask_fist()) - but when the task
> > wakes up on 4, the logic I put causes it to migrate to CPU2, which is the 2nd
> > big core. CPU1 and CPU2 are the big cores on Juno.
> >
> > Now maybe we should fix select_task_rq_rt() to better balance tasks, but not
> > sure how easy is that.
> >
>
> Thanks for the trace. Now things are clear to me. Two RT tasks woke up
> simultaneously and the first task got its previous CPU i.e CPU#1. The next task
> goes through find_lowest_rq() and got the same CPU#1. Since this task priority
> is not more than the just queued task (already queued on CPU#1), it is sent
> to its previous CPU i.e CPU#4 in your case.
>
> From task_woken_rt() path, CPU#4 attempts push_rt_tasks(). CPU#4 is
> not overloaded,
> but we have rt_task_fits_capacity() check which forces the push. Since the CPU
> is not overloaded, your has_unfit_tasks() comes to rescue and push the
> task. Since
> the task has not scheduled in yet, it is eligible for push. You added checks
> to skip resched_curr() in push_rt_tasks() otherwise the push won't happen.
Nice summary, that's exactly what it is :)
> Finally, I understood your patch. Obviously this is not clear to me
> before. I am not
> sure if this patch is the right approach to solve this race. I will
> think a bit more.
I haven't been staring at this code for as long as you, but since we have
logic at wakeup to do a push, I think we need something here anyway for unfit
tasks.
Fixing select_task_rq_rt() to better balance tasks will help a lot in general,
but if that was enough already then why do we need to consider a push at the
wakeup at all then?
AFAIU, in SMP the whole push-pull mechanism is racy and we introduce redundancy
at taking the decision on various points to ensure we minimize this racy nature
of SMP systems. Anything could have happened between the time we called
select_task_rq_rt() and the wakeup, so we double check again before we finally
go and run. That's how I interpret it.
I am open to hear about other alternatives first anyway. Your help has been
much appreciated so far.
Thanks
--
Qais Yousef