Re: [PATCH v9 4/8] mm,memory_hotplug: Allocate memmap from the added memory range

From: David Hildenbrand
Date: Wed Apr 21 2021 - 04:52:28 EST


On 21.04.21 10:49, Michal Hocko wrote:
On Wed 21-04-21 10:44:38, David Hildenbrand wrote:
On 21.04.21 10:39, Michal Hocko wrote:
On Wed 21-04-21 10:15:46, Oscar Salvador wrote:
On Tue, Apr 20, 2021 at 12:56:03PM +0200, Michal Hocko wrote:
[...]
necessary. Using two different iteration styles is also hurting the code
readability. I would go with the following
for (pfn = start_pfn; pfn < end_pfn; ) {
unsigned long order = min(MAX_ORDER - 1UL, __ffs(pfn));

while (start + (1UL << order) > end_pfn)
order--;
(*online_page_callback)(pfn_to_page(pfn), pageblock_order);
pfn += 1 << order;
}

which is what __free_pages_memory does already.

this is kinda what I used to have in the early versions, but it was agreed
with David to split it in two loops to make it explicit.
I can go back to that if it is preferred.

Not that I would insist but I find it better to use common constructs
when it doesn't hurt readability. The order evaluation can be even done
in a trivial helper.

+ if (memmap_on_memory) {
+ nr_vmemmap_pages = walk_memory_blocks(start, size, NULL,
+ get_nr_vmemmap_pages_cb);
+ if (nr_vmemmap_pages) {
+ if (size != memory_block_size_bytes()) {
+ pr_warn("Refuse to remove %#llx - %#llx,"
+ "wrong granularity\n",
+ start, start + size);
+ return -EINVAL;
+ }
+
+ /*
+ * Let remove_pmd_table->free_hugepage_table do the
+ * right thing if we used vmem_altmap when hot-adding
+ * the range.
+ */
+ mhp_altmap.alloc = nr_vmemmap_pages;
+ altmap = &mhp_altmap;
+ }
+ }
+
/* remove memmap entry */
firmware_map_remove(start, start + size, "System RAM");

I have to say I still dislike this and I would just wrap it inside out
and do the operation from within walk_memory_blocks but I will not
insist.

I have to confess I forgot about the details of that dicussion, as we were
quite focused on decoupling vmemmap pages from {online,offline} interface.
Would you mind elaborating a bit more?

As I've said I will not insist and this can be done in the follow up.
You are iterating over memory blocks just to refuse to do an operation
which can be split to several memory blocks. See
http://lkml.kernel.org/r/YFtPxH0CT5QZsnR1@xxxxxxxxxxxxxx and follow
walk_memory_blocks(start, size, NULL, remove_memory_block_cb)


We'll have to be careful in general when removing memory in different
granularity than it was added, especially calling arch_remove_memory() in
smaller granularity than it was added via arch_add_memory(). We might fail
to tear down the direct map, imagine having mapped a 1GiB page but decide to
remove individual 128 MiB chunks -- that won't work and the direct map would
currently remain.

Agreed but I am not referring to arbitrary hotremove path. All I am
pointing at is to split up to memory blocks and do the same kind of work
on each separately. Partial failures might turn out to be more tricky
and as I've said I do not insist on doing that right now but it is a bit
weird to outright fail the operation even when in fact there are more
blocks to be hot removed in once.

Agreed. But we should also focus on what actual users need to see if it's worth the trouble (I know of none that will be using memmap_on_memory).


--
Thanks,

David / dhildenb