Re: [PATCH v3] sched/fair: bring back select_idle_smt, but differently
From: Mel Gorman
Date: Sun Mar 28 2021 - 11:36:52 EST
On Fri, Mar 26, 2021 at 03:19:32PM -0400, Rik van Riel wrote:
> ---8<---
> sched,fair: bring back select_idle_smt, but differently
>
> Mel Gorman did some nice work in 9fe1f127b913
> ("sched/fair: Merge select_idle_core/cpu()"), resulting in the kernel
> being more efficient at finding an idle CPU, and in tasks spending less
> time waiting to be run, both according to the schedstats run_delay
> numbers, and according to measured application latencies. Yay.
>
> The flip side of this is that we see more task migrations (about
> 30% more), higher cache misses, higher memory bandwidth utilization,
> and higher CPU use, for the same number of requests/second.
>
> This is most pronounced on a memcache type workload, which saw
> a consistent 1-3% increase in total CPU use on the system, due
> to those increased task migrations leading to higher L2 cache
> miss numbers, and higher memory utilization. The exclusive L3
> cache on Skylake does us no favors there.
>
> On our web serving workload, that effect is usually negligible.
>
> It appears that the increased number of CPU migrations is generally
> a good thing, since it leads to lower cpu_delay numbers, reflecting
> the fact that tasks get to run faster. However, the reduced locality
> and the corresponding increase in L2 cache misses hurts a little.
>
> The patch below appears to fix the regression, while keeping the
> benefit of the lower cpu_delay numbers, by reintroducing select_idle_smt
> with a twist: when a socket has no idle cores, check to see if the
> sibling of "prev" is idle, before searching all the other CPUs.
>
> This fixes both the occasional 9% regression on the web serving
> workload, and the continuous 2% CPU use regression on the memcache
> type workload.
>
> With Mel's patches and this patch together, task migrations are still
> high, but L2 cache misses, memory bandwidth, and CPU time used are back
> down to what they were before. The p95 and p99 response times for the
> memcache type application improve by about 10% over what they were
> before Mel's patches got merged.
>
> Signed-off-by: Rik van Riel <riel@xxxxxxxxxxx>
FWIW, v3 appears to have performed faster than v2 on the few tests I ran
and the patch looks fine.
Reviewed-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
--
Mel Gorman
SUSE Labs