Re: [PATCH 0/2] Introduce SIS_CACHE to choose previous CPU during task wakeup

From: Chen Yu
Date: Thu Sep 28 2023 - 04:23:36 EST


Hi Ingo,

On 2023-09-27 at 10:00:11 +0200, Ingo Molnar wrote:
>
> * Chen Yu <yu.c.chen@xxxxxxxxx> wrote:
>
> > When task p is woken up, the scheduler leverages select_idle_sibling()
> > to find an idle CPU for it. p's previous CPU is usually preferred
> > because it can improve cache locality. However, in many cases the
> > previous CPU has already been taken by other wakees, so p has to
> > find another idle CPU.
> >
> > Inhibiting task migration while keeping the scheduler work conserving
> > could benefit many workloads. Inspired by Mathieu's proposal to limit
> > the task migration ratio[1], this patch considers the task's average
> > sleep duration. If the task is a short sleeper, its previous CPU is
> > tagged as cache hot for a short while. During this reservation period,
> > other wakees are not allowed to pick this idle CPU until a timeout.
> > If the task is woken up again soon, it can find its previous CPU still
> > idle and choose it in select_idle_sibling().
>
> Yeah, so I'm not convinced about this at this stage.
>
> By allowing a task to basically hog a CPU after it has gone idle already,
> however briefly, we reduce resource utilization efficiency for the sake
> of singular benchmark workloads.
>

Currently the code does not really reserve the idle CPU or force it to
stay idle. It only suggests a search order to other wakees when they look
for an idle CPU. If all idle CPUs are in the reserved state, the first
reserved idle CPU is picked rather than left idle, so the idle CPU
resources are still fully utilized. The main impact is on wakeup latency,
if I understand correctly. Let me run the latest schbench and monitor
these latency statistics in detail.
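
To make the intended search order concrete, below is a minimal sketch of
the behavior described above. It is not the actual SIS_CACHE code: the
is_idle/cache_hot_until arrays and the helper name are illustrative
assumptions standing in for the scheduler's real per-CPU state.

/*
 * Illustrative sketch only -- not the SIS_CACHE implementation.
 * cache_hot_until[cpu] is a hypothetical timestamp set when a short
 * sleeper dequeues; a CPU counts as "reserved" while now is before it.
 */
static int pick_idle_cpu_sketch(int nr_cpus, const int *is_idle,
                                const unsigned long long *cache_hot_until,
                                unsigned long long now)
{
        int cpu, first_reserved = -1;

        for (cpu = 0; cpu < nr_cpus; cpu++) {
                if (!is_idle[cpu])
                        continue;

                /* Prefer idle CPUs not reserved for a short sleeper. */
                if (now < cache_hot_until[cpu]) {
                        if (first_reserved < 0)
                                first_reserved = cpu;
                        continue;
                }
                return cpu;
        }

        /*
         * Work conservation: if every idle CPU is reserved, pick the
         * first reserved one instead of leaving it idle.
         */
        return first_reserved;
}

A reserved CPU is therefore only a lower-priority candidate, never
excluded, which is why the expected cost is extra wakeup latency rather
than idle CPUs being left unused.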

> In a mixed environment the cost of leaving CPUs idle longer than necessary
> will show up - and none of these benchmarks show that kind of side effect
> and indirect overhead.
>
> This feature would be a lot more convincing if it tried to measure overhead
> in the pathological case, not the case it's been written for.
>

Thanks for the suggestion, Ingo. Yes, we should run more tests to evaluate
this proposal. As Tim mentioned, we have previously tested it with an OLTP
benchmark as described in PATCH [2/2]. I'm thinking of running more
benchmarks to get a wider understanding of how this change would impact
them, covering both the positive and negative sides.

thanks,
Chenyu