Re: [PATCH v2 2/2] memblock: do not start bottom-up allocations with kernel_end
From: Roman Gushchin
Date: Sat Dec 19 2020 - 12:07:04 EST
On Sat, Dec 19, 2020 at 11:52:19PM +0900, Wonhyuk Yang wrote:
> Hi Roman,
>
> On Fri, Dec 18, 2020 at 5:12 AM Roman Gushchin <guro@xxxxxx> wrote:
> >
> > With KASLR, the kernel image is placed at a random location, so starting
> > the bottom-up allocation at kernel_end can result in an
> > allocation failure and a warning like this one:
> >
> > [ 0.002920] hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
> > [ 0.002921] ------------[ cut here ]------------
> > [ 0.002922] memblock: bottom-up allocation failed, memory hotremove may be affected
> > [ 0.002937] WARNING: CPU: 0 PID: 0 at mm/memblock.c:332 memblock_find_in_range_node+0x178/0x25a
> > [ 0.002956] Call Trace:
> > [ 0.002961] ? memblock_alloc_range_nid+0x8d/0x11e
> > [ 0.002963] ? cma_declare_contiguous_nid+0x2c4/0x38c
> > [ 0.002964] ? hugetlb_cma_reserve+0xdc/0x128
> > [ 0.002968] ? flush_tlb_one_kernel+0xc/0x20
> > [ 0.002969] ? native_set_fixmap+0x82/0xd0
> > [ 0.002971] ? flat_get_apic_id+0x5/0x10
> > [ 0.002973] ? register_lapic_address+0x8e/0x97
> > [ 0.002975] ? setup_arch+0x8a5/0xc3f
> > [ 0.002978] ? start_kernel+0x66/0x547
> > [ 0.002980] ? load_ucode_bsp+0x4c/0xcd
> > [ 0.002982] ? secondary_startup_64_no_verify+0xb0/0xbb
> > [ 0.002986] random: get_random_bytes called from __warn+0xab/0x110 with crng_init=0
> >
> > At the same time, the kernel image is protected with memblock_reserve(),
> > so we can just start searching at PAGE_SIZE. In this case the
> > bottom-up allocation has the same chance of succeeding as a top-down
> > allocation, so there is no reason to fall back in the case of a
> > failure. Altogether, this simplifies the logic.
>
> I figured out that it was introduced by
> commit 79442ed189acb ("memblock.c: introduce bottom-up allocation mode")
>
> According to this commit, the purpose of bottom-up allocation is to
> allocate memory from the unhotpluggable node.

Hi Wonhyuk,

Correct! And it remains this way; we just don't need to skip
all the memory below kernel_end.
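
For illustration only, here is a minimal userspace sketch of the idea. It is
not the code in mm/memblock.c: struct range, overlaps_reserved(),
find_bottom_up() and SKETCH_PAGE_SIZE are hypothetical names, alignment and
NUMA nodes are ignored, and a return value of 0 stands for "not found" (which
is safe only because the search never starts below one page).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumed page size for the sketch */

/* Hypothetical, simplified region: covers [base, base + size). */
struct range {
	uint64_t base;
	uint64_t size;
};

/*
 * Return true if [base, base + size) overlaps a reserved range. On overlap,
 * *next is set to the end of the conflicting reserved range.
 */
static bool overlaps_reserved(const struct range *rsv, int nr_rsv,
			      uint64_t base, uint64_t size, uint64_t *next)
{
	for (int i = 0; i < nr_rsv; i++) {
		uint64_t r_end = rsv[i].base + rsv[i].size;

		if (base < r_end && rsv[i].base < base + size) {
			*next = r_end;
			return true;
		}
	}
	return false;
}

/*
 * First-fit, bottom-up search inside [start, end), stepping over reserved
 * ranges. Returns the lowest suitable address, or 0 if nothing fits.
 */
static uint64_t find_bottom_up(const struct range *rsv, int nr_rsv,
			       uint64_t start, uint64_t end, uint64_t size)
{
	uint64_t cand = start;

	assert(size > 0);
	while (cand + size <= end) {
		uint64_t next;

		if (!overlaps_reserved(rsv, nr_rsv, cand, size, &next))
			return cand;
		cand = next; /* retry just above the reserved range */
	}
	return 0;
}
```

As a made-up example, take 4 GiB of memory with a 64 MiB kernel image that
KASLR happened to place at 3 GiB. A 2048 MiB request (the size from the
hugetlb_cma log above) finds nothing when the search starts at kernel_end,
because less than 1 GiB is left above it. Starting at one page instead finds
room immediately, and the reserved kernel image is stepped over whenever it is
in the way.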
Thanks!