Re: [PATCH v9 1/8] mm: Add per-cpu logic to page shuffling

From: Michal Hocko
Date: Tue Sep 10 2019 - 08:11:35 EST

On Mon 09-09-19 08:11:36, Alexander Duyck wrote:
> On Mon, 2019-09-09 at 10:14 +0200, David Hildenbrand wrote:
> > On 07.09.19 19:25, Alexander Duyck wrote:
> > > From: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
> > >
> > > Change the logic used to generate randomness in the suffle path so that we
> > > can avoid cache line bouncing. The previous logic was sharing the offset
> > > and entropy word between all CPUs. As such this can result in cache line
> > > bouncing and will ultimately hurt performance when enabled.
> >
> > So, usually we perform such changes if there is real evidence. Do you
> > have any such performance numbers to back your claims?
> I'll have to go rerun the test to get the exact numbers. The reason this
> came up is that my original test was spanning NUMA nodes and that made
> this more expensive as a result since the memory was both not local to the
> CPU and was being updated by multiple sockets.

What was the pattern of page freeing in your testing? I am wondering
because order 0 pages should be prevailing and those usually go via pcp
lists so they do not get shuffled unless the batch is full IIRC.
Michal Hocko