On Tue, Feb 18, 2025 at 08:25:59PM +0100, David Hildenbrand wrote:
> On 18.02.25 19:04, Gregory Price wrote:
> Hm?
> If you enable memmap_on_memory, we will place the memmap on that carved-out
> region, independent of ZONE_NORMAL/ZONE_MOVABLE etc. It's the "altmap".
> The reason we can place the memmap on ZONE_MOVABLE is that, although
> it is "unmovable", we told the memory offlining code that it doesn't have to
> care about offlining that memmap carveout; there is no migration to be done.
> Just offline the block (memmap gets stale) and remove that block (memmap
> gets removed).
> If there is a case where we carve out the memmap and *not* use it, that
> case must be fixed.
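For reference, whether the altmap carveout David describes is even available depends on a build option plus a runtime parameter; this is a quick way to inspect both (names as of recent kernels, paths may vary by distro):

```shell
# Build-time support for placing the memmap on the hotplugged range:
grep MHP_MEMMAP_ON_MEMORY "/boot/config-$(uname -r)" 2>/dev/null

# Runtime setting: 0/1 (newer kernels also accept "force"), chosen at
# boot via memory_hotplug.memmap_on_memory=; read-only at runtime:
cat /sys/module/memory_hotplug/parameters/memmap_on_memory 2>/dev/null
```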
Hm, it looks like I traced the wrong path through this particular code.
I will go back and redo my tests to sanity-check, but here's what I
would expect to see:
1) if memmap_on_memory is off, and hotplug capacity (node1) is
zone_movable - then zone_normal (node0) should have N pages
accounted in nr_memmap_pages
1a) when dropping these memory blocks, I should see node0 memory
use drop by 4GB - since this is just GFP_KERNEL pages.
2) if memmap_on_memory is on, and hotplug capacity (node1) is
zone_movable - then each memory block (256MB) should appear
as 252MB (-4MB of 64-byte page structs). For 256GB (my system)
I should see a total of 252GB of onlined memory (-4GB of page struct)
2a) when dropping these memory blocks, I should see node0 memory use
stay the same - since it was vmemmap usage.
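The numbers in (2) above can be sanity-checked with a little arithmetic, assuming 4KiB base pages and a 64-byte struct page (typical for x86_64):

```shell
BLOCK_BYTES=$((256 * 1024 * 1024))   # one 256MB memory block
PAGE_SIZE=4096                       # 4KiB base pages (assumption)
STRUCT_PAGE=64                       # sizeof(struct page) (assumption)

PAGES_PER_BLOCK=$((BLOCK_BYTES / PAGE_SIZE))          # 65536 pages
VMEMMAP_PER_BLOCK=$((PAGES_PER_BLOCK * STRUCT_PAGE))  # page-struct bytes
echo "vmemmap per 256MB block: $((VMEMMAP_PER_BLOCK / 1024 / 1024))MB"

# 256GB of hotplugged capacity = 1024 blocks of 256MB:
TOTAL=$((1024 * VMEMMAP_PER_BLOCK))
echo "vmemmap for 256GB: $((TOTAL / 1024 / 1024 / 1024))GB"
```

which gives 4MB per block (so each block onlines as 252MB) and 4GB total (252GB of 256GB usable), matching the expectations above.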
I will double-check that this isn't working as expected, and I'll check
for a build option as well.
stupid question - it sorta seems like you'd want this as the default
setting for driver-managed hotplug memory blocks, but I suppose for
very small blocks there are problems (as described in the docs).
:thinking: - is it silly to suggest maybe a per-driver memmap_on_memory
setting rather than just a global setting? For CXL capacity, this seems
like a no-brainer since blocks can't be smaller than 256MB (per spec).
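For what it's worth, newer kernels appear to expose something close to this for dax devices: a per-device sysfs knob rather than a per-driver one. Assuming a hypothetical device named dax0.0, it would look roughly like:

```shell
# Per-device memmap_on_memory control for a dax/kmem device ("dax0.0"
# is a placeholder name; the attribute depends on kernel version):
cat /sys/bus/dax/devices/dax0.0/memmap_on_memory 2>/dev/null || true

# Flip it before the device's memory is onlined (requires root, and
# fails while the capacity is online as system RAM):
echo 1 > /sys/bus/dax/devices/dax0.0/memmap_on_memory 2>/dev/null || true
```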