Re: [PATCH v2 1/5] mm: memory_hotplug: Memory hotplug (add) support for arm64

From: Arun KS
Date: Sun Nov 26 2017 - 01:58:23 EST


On Fri, Nov 24, 2017 at 4:23 PM, Maciej Bielski
<m.bielski@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Nov 24, 2017 at 09:42:33AM +0000, Andrea Reale wrote:
>> Hi Arun,
>>
>>
>> On Fri 24 Nov 2017, 11:25, Arun KS wrote:
>> > On Thu, Nov 23, 2017 at 4:43 PM, Maciej Bielski
>> > <m.bielski@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>> >> [ ...]
>> > > Introduces memory hotplug functionality (hot-add) for arm64.
>> > > @@ -615,6 +616,44 @@ void __init paging_init(void)
>> > > SWAPPER_DIR_SIZE - PAGE_SIZE);
>> > > }
>> > >
>> > > +#ifdef CONFIG_MEMORY_HOTPLUG
>> > > +
>> > > +/*
>> > > + * hotplug_paging() is used by memory hotplug to build new page tables
>> > > + * for hot added memory.
>> > > + */
>> > > +
>> > > +struct mem_range {
>> > > + phys_addr_t base;
>> > > + phys_addr_t size;
>> > > +};
>> > > +
>> > > +static int __hotplug_paging(void *data)
>> > > +{
>> > > + int flags = 0;
>> > > + struct mem_range *section = data;
>> > > +
>> > > + if (debug_pagealloc_enabled())
>> > > + flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
>> > > +
>> > > + __create_pgd_mapping(swapper_pg_dir, section->base,
>> > > + __phys_to_virt(section->base), section->size,
>> > > + PAGE_KERNEL, pgd_pgtable_alloc, flags);
>> >
>> > Hello Andrea,
>> >
>> > __hotplug_paging runs on stop_machine context.
>> > cpu stop callbacks must not sleep.
>> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/stop_machine.c?h=v4.14#n479
>> >
>> > __create_pgd_mapping uses pgd_pgtable_alloc. which does
>> > __get_free_page(PGALLOC_GFP)
>> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/mm/mmu.c?h=v4.14#n342
>> >
>> > PGALLOC_GFP has GFP_KERNEL which inturn has __GFP_RECLAIM
>> >
>> > #define PGALLOC_GFP (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO)
>> > #define GFP_KERNEL (__GFP_RECLAIM | __GFP_IO | __GFP_FS)
>> >
>> > Now, prepare_alloc_pages() called by __alloc_pages_nodemask checks for
>> >
>> > might_sleep_if(gfp_mask & __GFP_DIRECT_RECLAIM);
>> >
>> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/page_alloc.c?h=v4.14#n4150
>> >
>> > and then BUG()
>>
>> Well spotted, thanks for reporting the problem. One possible solution
>> would be to revert back to building the updated page tables on a copy
>> pgdir (as it was done in v1 of this patchset) and then replacing swapper
>> atomically with stop_machine.
>>
>> Actually, I am not sure if stop_machine is strictly needed,
>> if we modify the swapper pgdir live: for example, in x86_64
>> kernel_physical_mapping_init, atomicity is ensured by spin-locking on
>> init_mm.page_table_lock.
>> https://elixir.free-electrons.com/linux/v4.14/source/arch/x86/mm/init_64.c#L684
>> I'll spend some time investigating whoever else could be working
>> concurrently on the swapper pgdir.
>>
>> Any suggestion or pointer is very welcome.
>
> Hi Andrea, Arun,
>
> Alternative approach could be implementing pgd_pgtable_alloc_nosleep() and
> pointing this to hotplug_paging(). Subsequently, it could use different flags,
> eg:
>
> #define PGALLOC_GFP_NORECLAIM (__GFP_IO | __GFP_FS | __GFP_NOTRACK | __GFP_ZERO)

This solves the problem with __get_free_page.

But pgd_pgtable_alloc() -> pgtable_page_ctor() -> ptlock_alloc() and
then kmem_cache_alloc(page_ptl_cachep, GFP_KERNEL)
Same BUG again.

Regards,
Arun

>
> Is this unefficient approach in any way?
> Do we like the fact that the memory-attaching thread can go to sleep?
>
> BR,
>
>>
>> Thanks,
>> Andrea
>>
>> > I was testing on 4.4 kernel, but cross checked with 4.14 as well.
>> >
>> > Regards,
>> > Arun
>> >
>> >
>> > > +
>> > > + return 0;
>> > > +}
>> > > +
>> > > +inline void hotplug_paging(phys_addr_t start, phys_addr_t size)
>> > > +{
>> > > + struct mem_range section = {
>> > > + .base = start,
>> > > + .size = size,
>> > > + };
>> > > +
>> > > + stop_machine(__hotplug_paging, &section, NULL);
>> > > +}
>> > > +#endif /* CONFIG_MEMORY_HOTPLUG */
>> > > +
>> > > /*
>> > > * Check whether a kernel address is valid (derived from arch/x86/).
>> > > */
>> > > --
>> > > 2.7.4
>> > >
>> >
>>
>
> --
> Maciej Bielski