Re: [PATCH v7] arm64: mm: Populate vmemmap at the page level if not section aligned

From: Zhenhua Huang
Date: Mon Mar 03 2025 - 04:30:50 EST




On 2025/2/27 1:13, David Hildenbrand wrote:
Sorry, I somehow missed this mail.


No problem. Thanks for your reply.

Hi David,

I had the same doubt initially.
After going through the code, I noticed that for vmemmap_populate(), the
arguments "start" and "end" passed down should already be within one
section.
early path:
for_each_present_section_nr
     __populate_section_memmap
         ..
         vmemmap_populate()

hotplug path:
__add_pages
     section_activate
         vmemmap_populate()

Therefore, focusing only on the size seems OK to me, and the fallback
solution below appears unnecessary?
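
To make this concrete, here is a rough sketch of how the hotplug path ends up bounding the range (loosely based on __populate_section_memmap() in mm/sparse-vmemmap.c; treat the names and details as an approximation, not the exact upstream code):

	/*
	 * Sketch: __populate_section_memmap() turns a pfn range of at most
	 * one section into the virtual range of its struct pages before
	 * handing it to the arch code.
	 */
	unsigned long start = (unsigned long) pfn_to_page(pfn);
	unsigned long end = start + nr_pages * sizeof(struct page);

	/*
	 * nr_pages <= PAGES_PER_SECTION, so [start, end) never spans more
	 * than one section's worth of memmap.
	 */
	vmemmap_populate(start, end, nid, altmap);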

Ah, in that case it is fine. Might make sense to document/enforce that
somehow for the time being ...

Shall I document it and WARN_ON if the size exceeds one section? Something like:
WARN_ON(end - start > PAGES_PER_SECTION * sizeof(struct page))

Probably WARN_ON_ONCE() along with a comment that we should never exceed a single memory section.

Got it.
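
Something like the following in arm64's vmemmap_populate(), then (just a sketch of what I have in mind; the exact placement and the wording of the comment are open to adjustment):

	/*
	 * Both the early boot path (for_each_present_section_nr) and the
	 * hotplug path (section_activate) populate the vmemmap for at most
	 * one memory section per call, so the range passed in here must
	 * never exceed a single section.
	 */
	WARN_ON_ONCE(end - start > PAGES_PER_SECTION * sizeof(struct page));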



Since vmemmap_populate() is implemented per architecture, the change
should apply to other architectures as well. However, I have no setup to
test it on...
Therefore, may I implement it only for arm64 for now?

Would work for me; better than no warning.

I posted a patch several days ago; could you please help review it? https://lore.kernel.org/linux-mm/20250219084001.1272445-1-quic_zhenhuah@xxxxxxxxxxx/T/
Is there anything else you would suggest besides preferring WARN_ON_ONCE()?


Additionally, from the previous discussion, the change is worth
backporting (apologies for forgetting to CC the stable kernel in this version).
Keeping it arm64-only should simplify backporting. WDYT?

Jup. Of course, we could add a generic warning in a separate patch.




+/*
+ * Try to populate PMDs, but fallback to populating base pages when ranges
+ * would only partially cover a PMD.
+ */
 int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
					  int node, struct vmem_altmap *altmap)
 {
@@ -313,6 +317,9 @@ int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
	for (addr = start; addr < end; addr = next) {

As mentioned above, this for loop appears to be redundant for arm64 as
well; a single call to pmd_addr_end() should suffice.

Right, that was what was confusing me in the first place.


		next = pmd_addr_end(addr, end);

+		if (!IS_ALIGNED(addr, PMD_SIZE) || !IS_ALIGNED(next, PMD_SIZE))
+			goto fallback;
+
		pgd = vmemmap_pgd_populate(addr, node);
		if (!pgd)
			return -ENOMEM;
@@ -346,6 +353,7 @@ int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
			}
		} else if (vmemmap_check_pmd(pmd, node, addr, next))
			continue;
+fallback:
		if (vmemmap_populate_basepages(addr, next, node, altmap))
			return -ENOMEM;

It seems we never get the chance to call vmemmap_populate_basepages() here?


Can you elaborate?

It's invoked within vmemmap_populate_hugepages(), which is called by
vmemmap_populate(). This implies that we are always performing a whole
section hotplug?

Ah, you meant only in the context of this change, yes. I was confused, because there are other reasons why we run into that fallback (failing to allocate a PMD).

I observed that this logic was introduced by commit 2045a3b8911b ("mm/sparse-vmemmap: generalise vmemmap_populate_hugepages()"), which moved arch-dependent code (such as vmemmap_set_pmd() in arch/loongarch/mm/init.c and arch/x86/mm/init_64.c) into the common, arch-independent vmemmap_populate_hugepages().

I suspect that is what caused the confusion, as the generic helper may not be fully tailored to arm64. However, it does not seem to cause any bugs :)
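
For reference, the arm64 entry point looks roughly like this (paraphrased from arch/arm64/mm/mmu.c, so minor details may differ); if I read it correctly, only the 4K-pages configuration goes through vmemmap_populate_hugepages() and thus through the fallback discussed above:

int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
			       struct vmem_altmap *altmap)
{
	WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));

	if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES))
		/* No PMD block mappings for the vmemmap: use base pages. */
		return vmemmap_populate_basepages(start, end, node, altmap);
	else
		return vmemmap_populate_hugepages(start, end, node, altmap);
}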