Re: [PATCH] numa,sched: only consider less busy nodes as numa balancing destination

From: Artem Bityutskiy
Date: Fri May 08 2015 - 09:13:34 EST

On Wed, 2015-05-06 at 11:41 -0400, Rik van Riel wrote:
> On Wed, 06 May 2015 13:35:30 +0300
> Artem Bityutskiy <dedekind1@xxxxxxxxx> wrote:
> > we observe a tremendous regression between kernel version 3.16 and 3.17
> > (and up), and I've bisected it to this commit:
> >
> > a43455a sched/numa: Ensure task_numa_migrate() checks the preferred node
> Artem, Jirka, does this patch fix (or at least improve) the issues you
> have been seeing? Does it introduce any new regressions?

Hi Rik,

first of all thanks for your help!

I've tried this patch and it has a very small effect. I've also run the
benchmark with auto-NUMA disabled, which I think is a useful data point.
I used the tip of Linus's tree (v4.1-rc2+).

Kernel      Avg response time, ms
Vanilla     1481
Patched     1240
Reverted     256
Disabled     309

Vanilla: pristine v4.1-rc2+
Patched: Vanilla + this patch
Reverted: Vanilla + a revert of a43455a
Disabled: Vanilla and auto-NUMA disabled via procfs
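
For anyone wanting to reproduce the "Disabled" configuration, here is a
minimal Python sketch of toggling automatic NUMA balancing. It assumes the
knob is /proc/sys/kernel/numa_balancing (the text above only says "via
procfs"); the helper names are made up for illustration, and writing to the
real knob needs root.

```python
from pathlib import Path

# Assumed location of the knob; the mail itself only says "via procfs".
NUMA_BALANCING = Path("/proc/sys/kernel/numa_balancing")


def set_numa_balancing(enabled: bool, knob: Path = NUMA_BALANCING) -> None:
    """Write 1 (enable) or 0 (disable) to the knob. Needs root on a real system."""
    knob.write_text("1\n" if enabled else "0\n")


def numa_balancing_enabled(knob: Path = NUMA_BALANCING) -> bool:
    """Return True if the knob currently reads as non-zero."""
    return knob.read_text().strip() != "0"
```

Calling set_numa_balancing(False) before a run would correspond to the
"Disabled" row; the knob path is a parameter so the helpers can be tried
against an ordinary file first.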

I ran the benchmark for 1 hour for every configuration this time. I
cannot say for sure what the deviation is right now, but I think it is
tens of milliseconds, so disabled vs reverted _may_ be within the error
range. I need to do more experiments to confirm that.

So this patch dropped the average Web server response time from
about 1.4 seconds to about 1.2 seconds, which isn't a bad improvement,
but it is far less than what we get when reverting a43455a.
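
To put the table above in relative terms, here is a small Python sketch
that computes how much lower each configuration's average response time is
compared to Vanilla. The numbers are copied from the table; the function
and variable names are only illustrative.

```python
# Average response times in ms, copied from the table above.
results_ms = {"Vanilla": 1481, "Patched": 1240, "Reverted": 256, "Disabled": 309}


def reduction_vs_baseline(results, baseline="Vanilla"):
    """Percent reduction in response time relative to the baseline kernel."""
    base = results[baseline]
    return {
        name: 100.0 * (base - ms) / base
        for name, ms in results.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, pct in reduction_vs_baseline(results_ms).items():
        print(f"{name:<9} {pct:5.1f}% lower than Vanilla")
```

By this measure the patch cuts the response time by about 16%, while the
revert cuts it by about 83% and disabling auto-NUMA by about 79%.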


