Re: [RFC PATCH] sched/fair: Introduce SIS_PAIR to wakeup task on local idle core first

From: Mike Galbraith
Date: Thu May 25 2023 - 05:34:28 EST


On Thu, 2023-05-25 at 15:47 +0800, Chen Yu wrote:
> On 2023-05-22 at 09:10:33 +0200, Mike Galbraith wrote:
> >
> > At one extreme of the huge spectrum of possibilities, a couple less
> > than brilliant tasks playing high speed ping-pong can bounce all over a
> > box with zero consequences, but for a pair more akin to say Einstein
> > and Bohr pondering chalkboards full of mind bending math and meeting
> > occasionally at the water cooler to exchange snarky remarks, needlessly
> > bouncing them about forces them to repopulate chalkboards, and C2C
> > traffic you try to avoid via bounce you generate via bounce.
> >
> I guess what you mean is that, for a wakee with a large local cache
> footprint, it is not a good idea to wake it up on a remote core,
> because the wakee then has to repopulate its cache from scratch.

Yeah, and all variations in between.

> Yes, the problem is that the scheduler currently lacks a metric
> for a task's working set, i.e. per-task cache footprint tracking
> (although NUMA balancing does calculate per-task, per-node statistics).
> Given such a cache-aware metric, the wakee could be placed on a candidate
> CPU whose cache locality (either LLC or L2) is friendly to the wakee.
> Because there is no such accurate metric, a heuristic seems to be a
> compromise way to predict the task placement.

Nah, it's a dart toss. With a box full of net blaster tools, the odds
may even be favorable, but who knows what the wild will do.

> The C2C was mainly caused by accessing the global tg->load, so besides
> wakeup placement, there should also be other ways to mitigate C2C,
> such as reducing the frequency of accessing tg->load.

Attacking that is the only thing that makes any sense to me.

> Besides that, while studying the history of wake_wide(), I
> found that 10 years ago Michael proposed exactly the same strategy:
> check whether tasks A and B are waking each other up; if they are, put
> them together, otherwise spread them to different LLCs:
> https://lkml.org/lkml/2013/3/6/73
> And this version finally evolved into what wake_wide() looks like today
> in your patch:

Yeah, I've touched that, but it's still busted. I watched firefox
burst wake a way too big thread pool, but since worker-bees collected
zero flips, the heuristic says all is well, move along ginormous swarm.
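
For reference, the flip heuristic being discussed can be modelled in a few
lines of userspace C. This is only a sketch under assumptions: struct
task_model, record_wakee_model() and wake_wide_model() are hypothetical
names, the once-per-second halving of the flip count is left out, and
llc_size stands in for the number of CPUs sharing the LLC. It is not the
kernel source.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-task state; NOT the kernel's task_struct. */
struct task_model {
	unsigned int wakee_flips;             /* times this task switched wakee */
	const struct task_model *last_wakee;  /* last task this one woke up */
};

/* Count a "flip" whenever the waker wakes someone other than last time. */
static void record_wakee_model(struct task_model *waker,
			       const struct task_model *wakee)
{
	if (waker->last_wakee != wakee) {
		waker->last_wakee = wakee;
		waker->wakee_flips++;
	}
}

/*
 * Return true ("wake wide", i.e. spread) only if BOTH sides flip a lot
 * relative to llc_size. Assumption: llc_size models the LLC CPU count.
 */
static bool wake_wide_model(const struct task_model *waker,
			    const struct task_model *wakee,
			    unsigned int llc_size)
{
	unsigned int master = waker->wakee_flips;
	unsigned int slave = wakee->wakee_flips;

	if (master < slave) {
		unsigned int tmp = master;

		master = slave;
		slave = tmp;
	}
	/* A wakee that never wakes anyone keeps slave at 0: never "wide". */
	if (slave < llc_size || master < slave * llc_size)
		return false;
	return true;
}
```

In this model a waker that burst wakes 64 distinct workers collects 64
flips, but workers that wake nobody stay at zero, so wake_wide_model()
returns false for every one of them no matter how big the pool is.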

-Mike