Re: [PATCH 00/21] mm: introduce Designated Movable Blocks
From: Mike Rapoport
Date: Fri Sep 23 2022 - 07:20:23 EST
Hi Doug,
I only had time to skim through the patches, and before diving in I'd like
to clarify a few things.
On Mon, Sep 19, 2022 at 06:03:55PM -0700, Doug Berger wrote:
> On 9/19/2022 2:00 AM, David Hildenbrand wrote:
> >
> > How is this memory currently presented to the system?
>
> The 7278 device has four ARMv8 CPU cores in an SMP cluster and two memory
> controllers (MEMCs). Each MEMC is capable of controlling up to 8GB of DRAM.
> An example 7278 system might have 1GB on each controller, so an arm64 kernel
> might see 1GB on MEMC0 at 0x40000000-0x7FFFFFFF and 1GB on MEMC1 at
> 0x300000000-0x33FFFFFFF.
>
> The base capability described in commits 7-15 of this V1 patch set is to
> allow a 'movablecore' block to be created at a particular base address
> rather than solely at the end of addressable memory.
I think this capability is only useful when there is non-uniform access to
different memory ranges. Otherwise it wouldn't matter where the movable
pages reside. The system you describe looks quite NUMA to me, with two
memory controllers, each for accessing a partial range of the available
memory.
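
For reference, the example layout quoted above can be checked with a short
sketch. The two address ranges are the ones Doug gives for the example 7278
system; the helper names (range_size, controller_for) are made up for
illustration and are not part of the patch set or the kernel.

```python
# Address ranges from the example 7278 system quoted above (inclusive bounds).
MEMC_RANGES = {
    "MEMC0": (0x40000000, 0x7FFFFFFF),
    "MEMC1": (0x300000000, 0x33FFFFFFF),
}

GIB = 1 << 30


def range_size(start, end):
    """Size in bytes of an inclusive physical address range."""
    return end - start + 1


def controller_for(addr):
    """Return the name of the controller backing addr, or None for a hole."""
    for name, (start, end) in MEMC_RANGES.items():
        if start <= addr <= end:
            return name
    return None


if __name__ == "__main__":
    for name, (start, end) in MEMC_RANGES.items():
        print(name, hex(start), hex(end), range_size(start, end) // GIB, "GiB")
```

Both ranges come out at exactly 1 GiB, and every address between them is a
hole as far as this example is concerned.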
> > > expressed the desire to locate ZONE_MOVABLE memory on each
> > > memory controller to allow user space intensive processing to
> > > make better use of the additional memory bandwidth.
> >
> > Can you share some more detail on how exactly ZONE_MOVABLE would help
> > here to make better use of the memory bandwidth?
>
> ZONE_MOVABLE memory is effectively unusable by the kernel. It can be used by
> user space applications through both the page allocator and hugetlbfs.
> If a large 'movablecore' allocation is defined and it can only be located at
> the end of addressable memory then it will always be located on MEMC1 of a
> 7278 system. This will create a tendency for user space accesses to consume
> more bandwidth on the MEMC1 memory controller and kernel space accesses to
> consume more bandwidth on MEMC0. A more even distribution of ZONE_MOVABLE
> memory between the available memory controllers in theory makes more memory
> bandwidth available to user space intensive loads.
The theory makes perfect sense, but is there any practical evidence of
improvement?
Some benchmark results that illustrate the difference would be nice.
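
To make the placement argument concrete, here is a toy model, not a
measurement and not the kernel's actual zone setup code. It assumes the
example 7278 layout quoted earlier and a hypothetical 512MB 'movablecore'
request, and compares filling the movable region from the top of addressable
memory with splitting it evenly across controllers. The function names are
invented for illustration.

```python
# Inclusive address ranges from the example 7278 system quoted earlier.
MEMC_RANGES = {
    "MEMC0": (0x40000000, 0x7FFFFFFF),
    "MEMC1": (0x300000000, 0x33FFFFFFF),
}

MIB = 1 << 20


def movable_at_end(ranges, movable_bytes):
    """Toy model: place movable memory from the highest address downward."""
    placed = {}
    remaining = movable_bytes
    # Visit the ranges from the highest base address to the lowest.
    ordered = sorted(ranges.items(), key=lambda kv: kv[1][0], reverse=True)
    for name, (start, end) in ordered:
        take = min(end - start + 1, remaining)
        placed[name] = take
        remaining -= take
    return placed


def movable_split_evenly(ranges, movable_bytes):
    """Toy model: give each controller an equal share, capped at its size."""
    share = movable_bytes // len(ranges)
    return {
        name: min(share, end - start + 1)
        for name, (start, end) in ranges.items()
    }


if __name__ == "__main__":
    request = 512 * MIB  # hypothetical movablecore size
    for label, model in (("at end", movable_at_end),
                         ("split", movable_split_evenly)):
        result = model(MEMC_RANGES, request)
        print(label, {name: size // MIB for name, size in result.items()})
```

In this model the "at end" case puts all of the movable memory on MEMC1 and
none on MEMC0, which is the imbalance described above. Whether that imbalance
turns into a measurable bandwidth difference is exactly what benchmark
numbers would have to show.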
> > > BACKGROUND:
> > > NUMA architectures support distributing movablecore memory
> > > across each node, but it is undesirable to introduce the
> > > overhead and complexities of NUMA on systems that don't have a
> > > Non-Uniform Memory Architecture.
> >
> > What exactly would that look like? I think I am missing something :)
>
> The notion would be to consider each memory controller as a separate node,
> but as stated it is not desirable.
Why?
> Thanks for your consideration,
> Dough Baker ... I mean Doug Berger :).
--
Sincerely yours,
Mike.