On 9/19/2022 2:00 AM, David Hildenbrand wrote:
> Hi Dough,
>
> I have some high-level questions.
Thanks for your interest. I will attempt to answer them.

>> MOTIVATION:
>> Some Broadcom devices (e.g. 7445, 7278) contain multiple memory
>> controllers with each mapped in a different address range within
>> a Uniform Memory Architecture. Some users of these systems have
>> expressed the desire to locate ZONE_MOVABLE memory on each
>> memory controller to allow user space intensive processing to
>> make better use of the additional memory bandwidth.
>
> How large are these areas typically?
>
> How large are they in comparison to other memory in the system?
>
> How is this memory currently presented to the system?
I'm not certain what is typical because these systems are highly
configurable and Broadcom's customers have different ideas about
application processing.

The 7278 device has four ARMv8 CPU cores in an SMP cluster and two
memory controllers (MEMCs). Each MEMC is capable of controlling up to
8GB of DRAM. An example 7278 system might have 1GB on each controller,
so an arm64 kernel might see 1GB on MEMC0 at 0x40000000-0x7FFFFFFF and
1GB on MEMC1 at 0x300000000-0x33FFFFFFF.

> Can you share some more how exactly ZONE_MOVABLE would help here to make
> better use of the memory bandwidth?
The Designated Movable Block concept introduced here has the potential
to offer useful services to different constituencies. I tried to
highlight this in my V1 patch set with the hope of attracting some
interest, but it can complicate the overall discussion, so I would like
to maybe narrow the discussion here. It may be good to keep them in mind
when assessing the overall value, but perhaps the "other opportunities"
can be covered as a follow on discussion.

The base capability described in commits 7-15 of this V1 patch set is to
allow a 'movablecore' block to be created at a particular base address
rather than solely at the end of addressable memory.

ZONE_MOVABLE memory is effectively unusable by the kernel. It can be
used by user space applications through both the page allocator and the
Hugetlbfs. If a large 'movablecore' allocation is defined and it can
only be located at the end of addressable memory then it will always be
located on MEMC1 of a 7278 system. This will create a tendency for user
space accesses to consume more bandwidth on the MEMC1 memory controller
and kernel space accesses to consume more bandwidth on MEMC0. A more
even distribution of ZONE_MOVABLE memory between the available memory
controllers in theory makes more memory bandwidth available to user
space intensive loads.

>> Unfortunately, the historical monotonic layout of zones would
>> mean that if the lowest addressed memory controller contains
>> ZONE_MOVABLE memory then all of the memory available from
>> memory controllers at higher addresses must also be in the
>> ZONE_MOVABLE zone. This would force all kernel memory accesses
>> onto the lowest addressed memory controller and significantly
>> reduce the amount of memory available for non-movable
>> allocations.
>
> We do have code that relies on zones during boot to not overlap within a
> single node.
I believe my changes address all such reliance, but if you are aware of
something I missed please let me know.

>> The main objective of this patch set is therefore to allow a
>> block of memory to be designated as part of the ZONE_MOVABLE
>> zone where it will always only be used by the kernel page
>> allocator to satisfy requests for movable pages. The term
>> Designated Movable Block is introduced here to represent such a
>> block. The favored implementation allows modification of the
>
> Sorry to say, but that term is rather suboptimal to describe what you
> are doing here. You simply have some system RAM you'd want to have
> managed by ZONE_MOVABLE, no?
That may be true, but I found it superior to the 'sticky' movable
terminology put forth by Mel Gorman ;). I'm happy to entertain
alternatives, but they may not be as easy to find as you think.

>> 'movablecore' kernel parameter to allow specification of a base
>> address and support for multiple blocks. The existing
>> 'movablecore' mechanisms are retained. Other mechanisms based on
>> device tree are also included in this set.
>>
>> BACKGROUND:
>> NUMA architectures support distributing movablecore memory
>> across each node, but it is undesirable to introduce the
>> overhead and complexities of NUMA on systems that don't have a
>> Non-Uniform Memory Architecture.
>
> How exactly would that look like? I think I am missing something :)
The notion would be to consider each memory controller as a separate
node, but as stated it is not desirable.

> Why can't we simply designate these regions as CMA regions?
We and others have encountered significant performance issues when large
CMA regions are used. There are significant restrictions on the page
allocator's use of MIGRATE_CMA pages and the memory subsystem works very
hard to keep about half of the memory in the CMA region free. There have
been attempts to patch the CMA implementation to alter this behavior
(for example the set I referenced Mel's response to in [1]), but there
are users that desire the current behavior.

> Why do we have to start using ZONE_MOVABLE for them?
One of the "other opportunities" for Designated Movable Blocks is to
allow CMA to allocate from a DMB as an alternative. This would allow
current users to continue using CMA as they want, but would allow users
(e.g. hugetlb_cma) that are not sensitive to the allocation latency to
let the kernel page allocator make more complete use (i.e. waste less)
of the shared memory. ZONE_MOVABLE pageblocks are always MIGRATE_MOVABLE
so the restrictions placed on MIGRATE_CMA pageblocks are lifted within a
DMB.

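
As a rough illustration of the bandwidth argument made earlier in this
message (not part of the patch set), the example 7278 layout can be
modelled in a few lines of Python. The address ranges are the ones quoted
above. The helper names, the 512MB and 256MB block sizes, and the chosen
block bases are hypothetical and exist only for this sketch, which also
assumes a block never spans the hole between the two controllers.

```python
# Example 7278 layout from this thread: 1GB on each memory controller.
MIB = 1 << 20

MEMCS = {
    "MEMC0": (0x40000000, 0x7FFFFFFF),
    "MEMC1": (0x300000000, 0x33FFFFFFF),
}


def movable_per_memc(blocks):
    """Return the bytes of movable memory on each controller.

    `blocks` is a list of (base, length) tuples describing movable blocks.
    """
    result = {name: 0 for name in MEMCS}
    for base, length in blocks:
        last = base + length - 1
        for name, (start, end) in MEMCS.items():
            overlap = min(last, end) - max(base, start) + 1
            if overlap > 0:
                result[name] += overlap
    return result


def end_of_memory_block(length):
    """Place one block at the very end of addressable memory.

    Simplified model of a movable block that can only sit at the end of
    memory; assumes `length` fits within the highest controller.
    """
    top_end = max(end for _, end in MEMCS.values())
    return [(top_end + 1 - length, length)]


# Hypothetical case 1: 512MB of movable memory at the end of memory.
at_end = movable_per_memc(end_of_memory_block(512 * MIB))

# Hypothetical case 2: the same 512MB split into two designated blocks,
# one at the top of each controller.
designated = movable_per_memc([
    (0x70000000, 256 * MIB),   # top 256MB of MEMC0
    (0x330000000, 256 * MIB),  # top 256MB of MEMC1
])

for label, split in (("end of memory", at_end), ("designated", designated)):
    print(label, {name: size // MIB for name, size in split.items()})
```

In this model the end-of-memory placement puts all of the movable memory
on MEMC1, while the designated placement splits it evenly between the two
controllers, which is the more even distribution the answer above argues
for.
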
Thanks for your consideration,
Dough Baker ... I mean Doug Berger :).