Re: [PATCH] sched/fair: check for idle core

From: Julia Lawall
Date: Wed Oct 21 2020 - 09:49:04 EST




On Wed, 21 Oct 2020, Mel Gorman wrote:

> On Wed, Oct 21, 2020 at 02:56:06PM +0200, Julia Lawall wrote:
> >
> >
> > On Wed, 21 Oct 2020, Mel Gorman wrote:
> >
> > > On Wed, Oct 21, 2020 at 02:25:32PM +0200, Vincent Guittot wrote:
> > > > > I see Vincent already agreed with the patch so I could be wrong. Vincent,
> > > > > did I miss something stupid?
> > > >
> > > > This patch fixes the problem that we don't favor anymore the prev_cpu when it is idle since
> > > > commit 11f10e5420f6ce because load is not null when cpu is idle whereas runnable_load was
> > > > And this is important because this will then decide in which LLC we will looks for a cpu
> > > >
> > >
> > > Ok, that is understandable but I'm still concerned that the fix simply
> > > trades one problem for another by leaving related tasks remote to each
> > > other and increasing cache misses and remote data accesses.
> > >
> > > wake_affine_weight is a giant pain because really we don't care about the
> > > load on the waker CPU or its available, we care about whether it has idle
> > > siblings that can be found quickly. As tempting as ripping it out is,
> > > it never happened because sometimes it makes the right decision.
> >
> > My goal was to restore the previous behavior, when runnable load was used.
> > The patch removing the use of runnable load (11f10e5420f6) presented it
> > basically as that load balancing was using it, so wakeup should use it
> > too, and any way it didn't matter because idle CPUS were checked for
> > anyway.
> >
>
> Which is fair.
>
> > Is your point of view that the proposed change is overkill? Or is it that
> > the original behavior was not desirable?
> >
>
> I worry it's overkill because prev is always used if it is idle even
> if it is on a node remote to the waker. It cuts off the option of a
> wakee moving to a CPU local to the waker which is not equivalent to the
> original behaviour.

Could it be possible to check p->recent_used_cpu? If that is prev (or on
the same socket?), then prev could be a good choice. If that is on the
same socket as the waker, then maybe the waker would be better.

julia