Re: [PATCH v1 2/3] mm/memory_hotplug: don't shuffle complete zone when onlining memory
From: Dan Williams
Date: Wed Jun 17 2020 - 14:14:15 EST
On Tue, Jun 16, 2020 at 11:48 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Tue 16-06-20 10:03:31, Dan Williams wrote:
> > On Tue, Jun 16, 2020 at 10:00 AM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> > >
> > > On Tue, Jun 16, 2020 at 5:51 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > > >
> > > > On Tue 16-06-20 13:52:12, David Hildenbrand wrote:
> > > > > Commit e900a918b098 ("mm: shuffle initial free memory to improve
> > > > > memory-side-cache utilization") introduced shuffling of free pages
> > > > > during system boot and whenever we online memory blocks.
> > > > >
> > > > > However, whenever we online memory blocks, all pages that will be
> > > > > exposed to the buddy end up getting freed via __free_one_page(). In the
> > > > > general case, we free these pages in MAX_ORDER - 1 chunks, which
> > > > > corresponds to the shuffle order.
> > > > >
> > > > > Inside __free_one_page(), we will already shuffle the newly onlined pages
> > > > > using "to_tail = shuffle_pick_tail();". Drop explicit zone shuffling on
> > > > > memory hotplug.
> >
> > This was already explained in the initial patch submission. The
> > shuffle_pick_tail() shuffling at run time is only sufficient for
> > maintaining the shuffle. It's not sufficient for effectively
> > randomizing the free list.
>
> Yes, the "randomness" of the added memory will be lower. But is this
> observable for hotplug scenarios?

I'm not sure of the intersection of platforms using memory hotplug and
shuffling in production.

> Is memory hotplug for the normal
> memory really a thing in setups which use RAM as a cache?

I would point out again though that the utility of shuffling goes
beyond RAM-as-cache. I have seen some cost-sensitive customer platform
configurations that asymmetrically populate memory controllers. Think
1 DIMM on controller0 and 2 DIMMs on controller1. In that case Linux
can get into pathological situations where an application is
bandwidth-limited because it only accesses the single-DIMM backed
memory range. Shuffling balances accesses across all available memory
controllers, restoring full memory bandwidth for that configuration.
So shuffling is used to solve problems that are otherwise invisible to
Linux: there's no indication from the platform that one memory range
has lower bandwidth than another.
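
A toy model of the asymmetric case can make the effect concrete. This
is only a sketch under stated assumptions: three equally sized DIMMs,
no interleaving, controller0 backing the lowest third of the page
frame numbers, and an application that simply takes the first pages
off a single free list. PAGES_PER_DIMM, controller_of(),
allocation_split() and demo() are hypothetical names for illustration,
not kernel interfaces.

```python
import random

PAGES_PER_DIMM = 1024  # hypothetical size, chosen only for illustration


def controller_of(pfn):
    """Map a page frame number to its controller in this toy layout.

    Assumption: controller0 backs the first DIMM, controller1 the other two.
    """
    return 0 if pfn < PAGES_PER_DIMM else 1


def allocation_split(free_list, n_pages):
    """Count how many of the first n_pages allocations hit each controller."""
    counts = [0, 0]
    for pfn in free_list[:n_pages]:
        counts[controller_of(pfn)] += 1
    return counts


def demo(seed=0, n_pages=900):
    """Compare an address-ordered free list with a shuffled one."""
    total = 3 * PAGES_PER_DIMM
    ordered = list(range(total))
    shuffled = ordered[:]
    random.Random(seed).shuffle(shuffled)
    return allocation_split(ordered, n_pages), allocation_split(shuffled, n_pages)
```

In this model the ordered list serves every one of the 900 pages from
controller0, while the shuffled list spreads them over both
controllers roughly in proportion to the memory behind each.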
> While I do agree that the code wise the shuffling per online operation
> doesn't really have any overhead really but it would be really great to
> know whether it matters at all.

I agree this is a good test case, especially considering the
"dax_kmem" solution where memory might be reserved from the initial
shuffling and onlined later. I assume there's a cross-over point where
not shuffling hotplugged memory starts to be noticeable. I just don't
have those numbers handy.
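
Absent real numbers, a toy model at least shows the kind of difference
in question. The sketch below assumes an initially empty free list
that receives newly onlined pages in ascending order, each placed at
the head or the tail by a fair coin flip (standing in for the
shuffle_pick_tail() decision), and compares that with a uniform
shuffle of the same pages (standing in for the explicit zone shuffle).
The function names are hypothetical and this is not how the kernel's
per-order free lists are actually laid out.

```python
import random
from collections import deque


def online_with_coin_flip(n_pages, rng):
    """Free pages in ascending order, each to the head or tail by coin flip."""
    free_list = deque()
    for pfn in range(n_pages):
        if rng.random() < 0.5:
            free_list.append(pfn)      # place at the tail
        else:
            free_list.appendleft(pfn)  # place at the head
    return list(free_list)


def online_with_full_shuffle(n_pages, rng):
    """Uniformly shuffle all newly onlined pages (toy stand-in)."""
    free_list = list(range(n_pages))
    rng.shuffle(free_list)
    return free_list


def neighbor_fraction(free_list):
    """Fraction of list neighbors that are also physically adjacent pages."""
    pairs = zip(free_list, free_list[1:])
    return sum(abs(a - b) == 1 for a, b in pairs) / (len(free_list) - 1)
```

In this model about half of the list neighbors remain physically
adjacent after coin-flip placement, versus almost none after a uniform
shuffle. Whether that residual ordering is observable on real
hardware is exactly the open question.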