Re: [PATCH v7 1/2] mm/memory hotplug: fix zone->contiguous always false when hotplug
From: Li, Tianyou
Date: Thu Jan 08 2026 - 02:35:47 EST
Thank you very much for your review, David! Your suggestions are
clear, and the code and comments you posted are so well formatted that
I can copy and paste them without modification. Thanks.
Regards,
Tianyou
On 1/7/2026 4:03 AM, David Hildenbrand (Red Hat) wrote:
On 12/22/25 15:58, Tianyou Li wrote:
Function set_zone_contiguous used __pageblock_pfn_to_page to
check the whole pageblock is in the same zone. One assumption is
the memory section must online, otherwise the __pageblock_pfn_to_page
will return NULL, then the set_zone_contiguous will be false.
When move_pfn_range_to_zone invoked set_zone_contiguous, since the
memory section did not online, the return value will always be false.
To fix this issue, we removed the set_zone_contiguous from the
move_pfn_range_to_zone, and place it after memory section onlined.
Function remove_pfn_range_from_zone did not have this issue because
memory section remains online at the time set_zone_contiguous invoked.
The description is a bit hard to follow. Let me try:
"set_zone_contiguous() uses __pageblock_pfn_to_page() to detect
pageblocks that either do not exist (hole) or that do not belong to
the same zone.
__pageblock_pfn_to_page(), however, relies on pfn_to_online_page(),
effectively always returning NULL for memory ranges that were not
onlined yet. So when called on a range-to-be-onlined, it indicates a
memory hole to set_zone_contiguous().
Consequently, the set_zone_contiguous() call in
move_pfn_range_to_zone(), which happens early during memory onlining,
will never detect a zone as being contiguous. Bad.
To fix the issue, move the set_zone_contiguous() call to a later stage
in memory onlining, where pfn_to_online_page() will succeed: after we
mark the memory sections to be online"
Will change accordingly. Thanks.
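To make the ordering problem concrete, here is a minimal userspace sketch of
the behaviour described above. It is a toy model, not kernel code: every
toy_* name, the tiny section/pageblock sizes, and the simplified zone check
are assumptions made for illustration. The only property it borrows from the
discussion is that the pageblock lookup returns NULL for sections that are
not yet marked online.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy sizes, far smaller than any real configuration. */
#define TOY_PAGES_PER_SECTION 8UL
#define TOY_PAGEBLOCK_PAGES   4UL
#define TOY_NR_SECTIONS       4UL
#define TOY_NR_PAGES          (TOY_NR_SECTIONS * TOY_PAGES_PER_SECTION)

struct toy_page {
	int zone_id;
};

static struct toy_page toy_memmap[TOY_NR_PAGES];
static bool toy_section_online[TOY_NR_SECTIONS];

/* Models pfn_to_online_page(): NULL unless the section is online. */
struct toy_page *toy_pfn_to_online_page(unsigned long pfn)
{
	if (pfn >= TOY_NR_PAGES)
		return NULL;
	if (!toy_section_online[pfn / TOY_PAGES_PER_SECTION])
		return NULL;
	return &toy_memmap[pfn];
}

/*
 * Simplified stand-in for __pageblock_pfn_to_page(): the block counts
 * only if its first and last page are online and in the given zone.
 */
struct toy_page *toy_pageblock_pfn_to_page(unsigned long start,
					   unsigned long end, int zone_id)
{
	struct toy_page *first = toy_pfn_to_online_page(start);
	struct toy_page *last = toy_pfn_to_online_page(end - 1);

	if (!first || !last)
		return NULL;
	if (first->zone_id != zone_id || last->zone_id != zone_id)
		return NULL;
	return first;
}

/* Models the check in set_zone_contiguous(): any failing block is a hole. */
bool toy_zone_is_contiguous(unsigned long start, unsigned long end,
			    int zone_id)
{
	unsigned long pfn;

	for (pfn = start; pfn < end; pfn += TOY_PAGEBLOCK_PAGES) {
		unsigned long block_end = pfn + TOY_PAGEBLOCK_PAGES;

		if (block_end > end)
			block_end = end;
		if (!toy_pageblock_pfn_to_page(pfn, block_end, zone_id))
			return false;
	}
	return true;
}

/* Early onlining step: assign the pages to a zone, sections stay offline. */
void toy_move_pfn_range_to_zone(unsigned long start, unsigned long end,
				int zone_id)
{
	unsigned long pfn;

	for (pfn = start; pfn < end; pfn++)
		toy_memmap[pfn].zone_id = zone_id;
}

/* Later onlining step: mark every section in the range as online. */
void toy_mark_sections_online(unsigned long start, unsigned long end)
{
	unsigned long pfn;

	assert(start % TOY_PAGES_PER_SECTION == 0);
	assert(end % TOY_PAGES_PER_SECTION == 0);
	for (pfn = start; pfn < end; pfn += TOY_PAGES_PER_SECTION)
		toy_section_online[pfn / TOY_PAGES_PER_SECTION] = true;
}
```

In this model, calling toy_zone_is_contiguous() right after
toy_move_pfn_range_to_zone() reports false, because every lookup hits an
offline section. Calling it again after toy_mark_sections_online() reports
true for the same range, which is the reordering the fix relies on.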
Now, there is no need to add the handling to
mhp_init_memmap_on_memory(). Note how mhp_init_memmap_on_memory() in
memory_block_online() is always followed by online_pages().
Also, set_zone_contiguous() no longer depends on the previous
zone->contiguous state, so it makes sense to remove the
set_zone_contiguous() call in mhp_init_memmap_on_memory(), as you suggested.
So, it's sufficient to move it after the online_pages_range(). I would
also add a comment there saying something like:
/*
 * Now that the ranges are indicated as online, check whether the whole
 * zone is contiguous.
 */
Will change accordingly. Thanks.
Can we find some Fixes: tag (which commit introduced the regression)?
Likely we want to CC stable.
Yes, we can probably add the tags below. That commit changed
pfn_to_page() to pfn_to_online_page() in __pageblock_pfn_to_page().
Fixes: 2d070eab2e82 ("mm: consider zone which is not fully populated to have holes")
Cc: Michal Hocko <mhocko@xxxxxxxx>