Re: [PATCH] arm64: Track no early_pgtable_alloc() for kmemleak

From: Mike Rapoport
Date: Thu Nov 04 2021 - 13:07:01 EST


On Thu, Nov 04, 2021 at 11:56:23AM -0400, Qian Cai wrote:
> After switching the page size from 64KB to 4KB on several arm64 servers
> here, kmemleak starts to run out of its early memory pool due to the huge
> number of early_pgtable_alloc() calls:
>
> kmemleak_alloc_phys()
> memblock_alloc_range_nid()
> memblock_phys_alloc_range()
> early_pgtable_alloc()
> init_pmd()
> alloc_init_pud()
> __create_pgd_mapping()
> __map_memblock()
> paging_init()
> setup_arch()
> start_kernel()
>
> Increasing the default value of DEBUG_KMEMLEAK_MEM_POOL_SIZE by 4 times
> is still not enough for a server with 200GB+ memory. There isn't much
> interest in checking memory leaks for those early page tables, and those
> early memory mappings should not reference other memory. Hence, there are
> no kmemleak false positives, and we can safely skip tracking those early
> allocations in kmemleak, as was done in commit fed84c785270
> ("mm/memblock.c: skip kmemleak for kasan_init()"), without introducing
> complications to automatically scale the value depending on the runtime
> memory size, etc. After this patch, the default value of
> DEBUG_KMEMLEAK_MEM_POOL_SIZE becomes sufficient again.
>
> Signed-off-by: Qian Cai <quic_qiancai@xxxxxxxxxxx>
> ---
> arch/arm64/mm/mmu.c | 3 ++-
> include/linux/memblock.h | 1 +
> mm/memblock.c | 10 +++++++---
> 3 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index d77bf06d6a6d..4d3cfbaa92a7 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -96,7 +96,8 @@ static phys_addr_t __init early_pgtable_alloc(int shift)
> phys_addr_t phys;
> void *ptr;
>
> - phys = memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
> + phys = memblock_phys_alloc_range(PAGE_SIZE, PAGE_SIZE, 0,
> + MEMBLOCK_ALLOC_PGTABLE);
> if (!phys)
> panic("Failed to allocate page table page\n");
>
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 7df557b16c1e..de903055b01c 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -390,6 +390,7 @@ static inline int memblock_get_region_node(const struct memblock_region *r)
> #define MEMBLOCK_ALLOC_ANYWHERE (~(phys_addr_t)0)
> #define MEMBLOCK_ALLOC_ACCESSIBLE 0
> #define MEMBLOCK_ALLOC_KASAN 1
> +#define MEMBLOCK_ALLOC_PGTABLE 2
>
> /* We are using top down, so it is safe to use 0 here */
> #define MEMBLOCK_LOW_LIMIT 0
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 659bf0ffb086..13bc56a641c0 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -287,7 +287,8 @@ static phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
> {
> /* pump up @end */
> if (end == MEMBLOCK_ALLOC_ACCESSIBLE ||
> - end == MEMBLOCK_ALLOC_KASAN)
> + end == MEMBLOCK_ALLOC_KASAN ||
> + end == MEMBLOCK_ALLOC_PGTABLE)

I think it'll be better to rename MEMBLOCK_ALLOC_KASAN to, say,
MEMBLOCK_ALLOC_NOKMEMLEAK and use it for both the KASAN and page table cases.

But more generally, we are going to hit this again and again.
Couldn't we add a memblock allocation as a means for
kmemleak::mem_pool_alloc() to get more memory?
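
For instance, mem_pool_alloc() could refill from memblock while the slab is
not up yet, instead of relying on a statically sized pool. A rough, untested
sketch (locking and the free-list reuse path in the real function are left
out):

```c
/* mm/kmemleak.c -- untested sketch, error paths and locking omitted */
static struct kmemleak_object *mem_pool_alloc(gfp_t gfp)
{
	struct kmemleak_object *object;

	/* once the slab cache exists, use it as today */
	if (object_cache)
		return kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));

	/* static pool exhausted: grab a fresh object from memblock */
	if (mem_pool_free_count == 0) {
		object = memblock_alloc(sizeof(*object), sizeof(void *));
		if (object)
			return object;
	}

	/* otherwise hand out the next slot from the static early pool */
	return &mem_pool[--mem_pool_free_count];
}
```

That would make DEBUG_KMEMLEAK_MEM_POOL_SIZE a starting point rather than a
hard limit during early boot.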

> end = memblock.current_limit;
>
> /* avoid allocating the first page */
> @@ -1387,8 +1388,11 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
> return 0;
>
> done:
> - /* Skip kmemleak for kasan_init() due to high volume. */
> - if (end != MEMBLOCK_ALLOC_KASAN)
> + /*
> + * Skip kmemleak for kasan_init() and early_pgtable_alloc() due to high
> + * volume.
> + */
> + if (end != MEMBLOCK_ALLOC_KASAN && end != MEMBLOCK_ALLOC_PGTABLE)
> /*
> * The min_count is set to 0 so that memblock allocated
> * blocks are never reported as leaks. This is because many
> --
> 2.30.2
>

--
Sincerely yours,
Mike.