Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

From: Michal Hocko
Date: Tue Nov 27 2018 - 13:21:42 EST


On Tue 27-11-18 19:17:27, Michal Hocko wrote:
> On Tue 27-11-18 09:08:50, Linus Torvalds wrote:
> > On Mon, Nov 26, 2018 at 10:24 PM kernel test robot
> > <rong.a.chen@xxxxxxxxx> wrote:
> > >
> > > FYI, we noticed a -61.3% regression of vm-scalability.throughput due
> > > to commit ac5b2c18911f ("mm: thp: relax __GFP_THISNODE for
> > > MADV_HUGEPAGE mappings")
> >
> > Well, that's certainly noticeable and not good.
> >
> > Andrea, I suspect it might be causing fights with auto numa migration..
> >
> > Lots more system time, but also look at this:
> >
> > > 1122389 ± 9% +17.2% 1315380 ± 4% proc-vmstat.numa_hit
> > > 214722 ± 5% +21.6% 261076 ± 3% proc-vmstat.numa_huge_pte_updates
> > > 1108142 ± 9% +17.4% 1300857 ± 4% proc-vmstat.numa_local
> > > 145368 ± 48% +63.1% 237050 ± 17% proc-vmstat.numa_miss
> > > 159615 ± 44% +57.6% 251573 ± 16% proc-vmstat.numa_other
> > > 185.50 ± 81% +8278.6% 15542 ± 40% proc-vmstat.numa_pages_migrated
> >
> > Should the commit be reverted? Or perhaps at least modified?
>
> Well, the commit is trying to revert to the behavior before
> 5265047ac301 because there are real use cases that suffered from that
> change and bug reports as a result of that.
>
> will-it-scale is certainly worth considering but it is an artificial
> testcase. A higher NUMA miss rate is an expected side effect of the
> patch because the fallback to a different NUMA node is more likely. The
> __GFP_THISNODE side effect is basically introducing node-reclaim
> behavior for THPages. Another thing is that there is no good behavior
> for everybody. Reclaim locally vs. THP on a remote node is hard to
> tell by default. We have discussed that at length and there were some
> conclusions. One of them is that we need a numa policy to tell whether
> an expensive locality is preferred over remote allocation. Also we
> definitely need a better pro-active defragmentation to allow larger
> pages on a local node. This is a work in progress and this patch is a
> stop gap fix.

Btw. the associated discussion is http://lkml.kernel.org/r/20180925120326.24392-1-mhocko@xxxxxxxxxx

--
Michal Hocko
SUSE Labs