Re: [RFC][PATCH 15/16] sched: Trivial forced-newidle balancer
From: Peter Zijlstra
Date: Thu Feb 21 2019 - 11:47:12 EST
On Thu, Feb 21, 2019 at 05:41:46PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 21, 2019 at 04:19:46PM +0000, Valentin Schneider wrote:
> > Hi,
> >
> > On 18/02/2019 16:56, Peter Zijlstra wrote:
> > [...]
> > > +static bool try_steal_cookie(int this, int that)
> > > +{
> > > + struct rq *dst = cpu_rq(this), *src = cpu_rq(that);
> > > + struct task_struct *p;
> > > + unsigned long cookie;
> > > + bool success = false;
> > > +
> > > + local_irq_disable();
> > > + double_rq_lock(dst, src);
> > > +
> > > + cookie = dst->core->core_cookie;
> > > + if (!cookie)
> > > + goto unlock;
> > > +
> > > + if (dst->curr != dst->idle)
> > > + goto unlock;
> > > +
> > > + p = sched_core_find(src, cookie);
> > > + if (p == src->idle)
> > > + goto unlock;
> > > +
> > > + do {
> > > + if (p == src->core_pick || p == src->curr)
> > > + goto next;
> > > +
> > > + if (!cpumask_test_cpu(this, &p->cpus_allowed))
> > > + goto next;
> > > +
> > > + if (p->core_occupation > dst->idle->core_occupation)
> > > + goto next;
> > > +
> >
> > IIUC, we're trying to find/steal tasks matching the core_cookie from other
> > rqs because dst has been cookie-forced-idle.
> >
> > If the p we find isn't running, what's the meaning of core_occupation?
> > I would have expected it to be 0, but we don't seem to be clearing it when
> > resetting the state in pick_next_task().
>
> Indeed. We preserve the occupation from the last time around; it's not
> perfect but it's better than nothing.
>
> Consider there are two groups, and we just happen to run the other
> group. Then our occupation, being what it was last time, is still
> accurate: when next we run, we'll again get that many siblings
> together.
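To make that concrete, here's a sketch in plain Python (not kernel code, and the names are mine, not the patch's) of why the deliberately-not-cleared core_occupation stays a useful predictor:

```python
# Model: occupation is recorded whenever a cookie group is picked onto a
# core's siblings, and deliberately not cleared when another group runs.

occupation = {}  # task -> how many siblings its cookie last ran with

def run_group(group):
    # the group's tasks are picked onto the core's siblings together
    for task in group:
        occupation[task] = len(group)

# An SMT2 core alternates between group A (two tasks) and group B (one).
run_group(["a0", "a1"])       # A runs 2 wide
run_group(["b0"])             # B runs; A's occupation is now stale...
assert occupation["a0"] == 2  # ...but still predicts how wide A runs next
assert occupation["b0"] == 1
```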
>
> > If it is running, we prevent the stealing if the core it's on is running
> > more matching tasks than the core of the pulling rq. It feels to me as if
> > that's a balancing tweak to try to cram as many matching tasks as possible
> > in a single core, so to me this reads as "don't steal my tasks if I'm
> > running more than you are, but I will steal tasks from you if I'm given
> > the chance". Is that correct?
>
> Correct, otherwise an SMT4 with 5 tasks could end up ping-ponging the
> one task forever.
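The effect of the occupation check can be sketched in plain Python (again not kernel code; the helper name is mine):

```python
# Model of the occupation test in try_steal_cookie(), showing how it
# stops the SMT4-with-5-tasks ping-pong.

def may_steal(p_occupation, dst_idle_occupation):
    # Mirrors: if (p->core_occupation > dst->idle->core_occupation) goto next;
    return p_occupation <= dst_idle_occupation

# An SMT4 core runs 4 cookie-matched siblings; the 5th task's would-be
# destination last saw the cookie run only 1 wide.  Stealing from the
# fuller core would just move the hole around, so it's refused.
assert not may_steal(4, 1)

# Cores that last ran the cookie equally wide may trade tasks.
assert may_steal(2, 2)
```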
>
> Note that a further condition a little up the call chain from here only
> does this stealing if the thread was forced idle -- i.e. it had
> something to run anyway. So under the condition where there simply
> aren't enough tasks to keep all siblings busy, we'll not compact just
> because.
Better example: it will avoid stealing a task from a full SMT2 core just
to fill another.