Re: [PATCH v8 -tip 06/26] sched: Add core wide task selection and scheduling.

From: Joel Fernandes
Date: Fri Oct 23 2020 - 13:57:28 EST


On Fri, Oct 23, 2020 at 03:54:00PM +0200, Peter Zijlstra wrote:
> On Fri, Oct 23, 2020 at 03:51:29PM +0200, Peter Zijlstra wrote:
> > On Mon, Oct 19, 2020 at 09:43:16PM -0400, Joel Fernandes (Google) wrote:
> > > + /*
> > > + * If this sibling doesn't yet have a suitable task to
> > > + * run; ask for the most elegible task, given the
> > > + * highest priority task already selected for this
> > > + * core.
> > > + */
> > > + p = pick_task(rq_i, class, max);
> > > + if (!p) {
> > > + /*
> > > + * If there weren't no cookies; we don't need to
> > > + * bother with the other siblings.
> > > + * If the rest of the core is not running a tagged
> > > + * task, i.e. need_sync == 0, and the current CPU
> > > + * which called into the schedule() loop does not
> > > + * have any tasks for this class, skip selecting for
> > > + * other siblings since there's no point. We don't skip
> > > + * for RT/DL because that could make CFS force-idle RT.
> > > + */
> > > + if (i == cpu && !need_sync && class == &fair_sched_class)
> > > + goto next_class;
> > > +
> > > + continue;
> > > + }
> >
> > I'm failing to understand the class == &fair_sched_class bit.

The last line in the comment explains it: "We don't skip for RT/DL because
that could make CFS force-idle RT."

Even if need_sync == false, we need to go look at other CPUs (non-local
CPUs) to see if they could be running RT.

Say the RQs in a particular core look like this.
Let CFS1 and CFS2 be 2 tagged CFS tasks. Let RT1 be an untagged RT task.

rq0             rq1
CFS1 (tagged)   RT1 (not tagged)
CFS2 (tagged)

Say schedule() runs on rq0. It enters the above loop, and pick_task(RT)
returns NULL for 'p' since rq0 has no RT task. Without the class check, it
enters the above if() block, sees that need_sync == false, and skips the RT
class entirely without ever looking at rq1.

The end result of the selection will be (say prio(CFS1) > prio(CFS2)):
rq0             rq1
CFS1            IDLE

When it should have selected:
rq0             rq1
IDLE            RT1

I saw this issue in real-world use cases on ChromeOS, where an RT task gets
constantly force-idled and breaks RT. The "class == &fair_sched_class" bit
cures it.
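
To make the scenario above concrete, here is a small userspace toy model of
the sibling loop. It is only a sketch, not kernel code: every toy_* name is
invented for illustration, each rq holds at most one task per class, the
first pick always stays 'max', and the skip_any_class flag is a hypothetical
switch that models dropping the "class == &fair_sched_class" test.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or helpers exist in the kernel. */
enum toy_class { TOY_RT, TOY_FAIR, TOY_NR_CLASSES };	/* highest class first */

struct toy_task {
	const char *name;
	unsigned long cookie;		/* 0 means untagged */
};

struct toy_rq {
	const struct toy_task *queued[TOY_NR_CLASSES];	/* one task per class, or NULL */
};

static const struct toy_task toy_idle = { "IDLE", 0 };

/*
 * Simplified pick: a candidate whose cookie differs from the current max
 * cannot run next to it, so the sibling gets idle instead (toy rule: the
 * earlier pick always wins).
 */
static const struct toy_task *toy_pick_task(const struct toy_rq *rq, int cls,
					    const struct toy_task *max)
{
	const struct toy_task *p = rq->queued[cls];

	if (!p)
		return NULL;
	if (max && p->cookie != max->cookie)
		return &toy_idle;
	return p;
}

/* Toy selection for a 2-thread core, with schedule() running on 'cpu'. */
static void toy_select(const struct toy_rq rq[2], int cpu, bool skip_any_class,
		       const struct toy_task *pick[2])
{
	const struct toy_task *max = NULL;
	bool need_sync = false;		/* core is not running a tagged task */

	pick[0] = pick[1] = NULL;
	for (int cls = TOY_RT; cls < TOY_NR_CLASSES; cls++) {
		for (int n = 0; n < 2; n++) {
			int i = (cpu + n) % 2;	/* local CPU first */
			const struct toy_task *p;

			if (pick[i])
				continue;
			p = toy_pick_task(&rq[i], cls, max);
			if (!p) {
				if (i == cpu && !need_sync &&
				    (skip_any_class || cls == TOY_FAIR))
					break;	/* stands in for "goto next_class" */
				continue;
			}
			pick[i] = p;
			if (i == cpu && !need_sync && !p->cookie)
				goto done;	/* the 'normal' no-cookie fast path */
			if (!max) {
				max = p;
				need_sync = p->cookie != 0;
			}
		}
	}
done:
	for (int i = 0; i < 2; i++)
		if (!pick[i])
			pick[i] = &toy_idle;	/* nothing selected: sibling idles */
}
```

With rq0 holding only CFS1 (tagged) and rq1 holding only RT1 (untagged),
calling toy_select() for cpu 0 with skip_any_class == true yields CFS1 on rq0
and IDLE on rq1, i.e. RT1 is force-idled. With skip_any_class == false the RT
pass still visits rq1, so the result is IDLE on rq0 and RT1 on rq1.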

> > > + * for RT/DL because that could make CFS force-idle RT.
> > IIRC the condition is such that the core doesn't have a cookie (we don't
> > need to sync the threads) so we'll only do a pick for our local CPU.
> >
> > That should be invariant of class.
>
> That is; it should be the exact counterpart of this bit:
>
> > + /*
> > + * Optimize the 'normal' case where there aren't any
> > + * cookies and we don't need to sync up.
> > + */
> > + if (i == cpu && !need_sync && !p->core_cookie) {
> > + next = p;
> > + goto done;
> > + }
>
> If there is no task found in this class, try the next class, if there
> is, we done.

That's OK, but we cannot skip the RT class on other CPUs.

thanks,

- Joel