Re: [PATCH v2 1/7] mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case

From: David Hildenbrand
Date: Tue Mar 03 2020 - 03:35:06 EST


On 20.02.20 05:33, Baoquan He wrote:
> In section_deactivate(), pfn_to_page() doesn't work any more after
> ms->section_mem_map is resetting to NULL in SPARSEMEM|!VMEMMAP case.
> It caused hot remove failure:
>
> kernel BUG at mm/page_alloc.c:4806!
> invalid opcode: 0000 [#1] SMP PTI
> CPU: 3 PID: 8 Comm: kworker/u16:0 Tainted: G W 5.5.0-next-20200205+ #340
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> RIP: 0010:free_pages+0x85/0xa0
> Call Trace:
> __remove_pages+0x99/0xc0
> arch_remove_memory+0x23/0x4d
> try_remove_memory+0xc8/0x130
> ? walk_memory_blocks+0x72/0xa0
> __remove_memory+0xa/0x11
> acpi_memory_device_remove+0x72/0x100
> acpi_bus_trim+0x55/0x90
> acpi_device_hotplug+0x2eb/0x3d0
> acpi_hotplug_work_fn+0x1a/0x30
> process_one_work+0x1a7/0x370
> worker_thread+0x30/0x380
> ? flush_rcu_work+0x30/0x30
> kthread+0x112/0x130
> ? kthread_create_on_node+0x60/0x60
> ret_from_fork+0x35/0x40
>
> Let's move the ->section_mem_map resetting after depopulate_section_memmap()
> to fix it.
>
> Signed-off-by: Baoquan He <bhe@xxxxxxxxxx>
> ---
> mm/sparse.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 596b2a45b100..b8e52c8fed7f 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -779,13 +779,15 @@ static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
> ms->usage = NULL;
> }
> memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
> - ms->section_mem_map = (unsigned long)NULL;
> }
>
> if (section_is_early && memmap)
> free_map_bootmem(memmap);
> else
> depopulate_section_memmap(pfn, nr_pages, altmap);
> +
> + if (bitmap_empty(subsection_map, SUBSECTIONS_PER_SECTION))
> + ms->section_mem_map = (unsigned long)NULL;
> }
>
> static struct page * __meminit section_activate(int nid, unsigned long pfn,
>

As discussed, I think this is broken here already. As you explained, the
subsection_map can get freed via kfree(ms->usage) after the first
bitmap_empty(subsection_map, SUBSECTIONS_PER_SECTION) check.

--
Thanks,

David / dhildenb