[PATCHv3] mm: page_alloc: fix CMA area initialisation when pageblock > MAX_ORDER

From: Michal Nazarewicz
Date: Mon Jun 23 2014 - 15:40:59 EST


With a kernel configured with ARM64_64K_PAGES && !TRANSPARENT_HUGEPAGE,
the following is triggered at early boot:

SMP: Total of 8 processors activated.
devtmpfs: initialized
Unable to handle kernel NULL pointer dereference at virtual address 00000008
pgd = fffffe0000050000
[00000008] *pgd=00000043fba00003, *pmd=00000043fba00003, *pte=00e0000078010407
Internal error: Oops: 96000006 [#1] SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-rc864k+ #44
task: fffffe03bc040000 ti: fffffe03bc080000 task.ti: fffffe03bc080000
PC is at __list_add+0x10/0xd4
LR is at free_one_page+0x270/0x638
...
Call trace:
[<fffffe00003ee970>] __list_add+0x10/0xd4
[<fffffe000019c478>] free_one_page+0x26c/0x638
[<fffffe000019c8c8>] __free_pages_ok.part.52+0x84/0xbc
[<fffffe000019d5e8>] __free_pages+0x74/0xbc
[<fffffe0000c01350>] init_cma_reserved_pageblock+0xe8/0x104
[<fffffe0000c24de0>] cma_init_reserved_areas+0x190/0x1e4
[<fffffe0000090418>] do_one_initcall+0xc4/0x154
[<fffffe0000bf0a50>] kernel_init_freeable+0x204/0x2a8
[<fffffe00007520a0>] kernel_init+0xc/0xd4

This happens because init_cma_reserved_pageblock() calls
__free_one_page() with pageblock_order as the page order, but
pageblock_order is bigger than MAX_ORDER. This in turn causes accesses
past zone->free_list[].
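To make the out-of-bounds access concrete, here is a minimal standalone
sketch; MAX_ORDER = 11 and pageblock_order = 13 are assumed example
values for an arm64 64K-page configuration, not figures quoted in the
report:

```c
#include <assert.h>

/* Assumed example values: MAX_ORDER = 11 (a common buddy allocator
 * limit) and pageblock_order = 13 (arm64 with 64K pages and
 * CONFIG_HUGETLB_PAGE). */
#define MAX_ORDER       11
#define PAGEBLOCK_ORDER 13

/* The per-zone free lists are indexed by order, with valid indices
 * 0 .. MAX_ORDER-1; freeing at pageblock_order indexes past the end. */
static int order_in_bounds(unsigned int order)
{
    return order < MAX_ORDER;
}
```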

Fix the problem by changing init_cma_reserved_pageblock() such that it
splits the pageblock into individual MAX_ORDER - 1 chunks if the
pageblock is bigger than a MAX_ORDER - 1 page.

In cases where !CONFIG_HUGETLB_PAGE_SIZE_VARIABLE, which is all
architectures except ia64, powerpc and tile at the moment, the
"pageblock_order >= MAX_ORDER" condition will be optimised out since
both sides of the operator are constants. In cases where pageblock
size is variable, the performance degradation should not be
significant anyway since init_cma_reserved_pageblock() is called
only at boot time at most MAX_CMA_AREAS times, which by default is
eight.
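As a sanity check, the chunk count produced by that split can be
sketched in plain C; the MAX_ORDER and pageblock_order values are
assumptions for an arm64 64K-page kernel, not values quoted anywhere in
this thread:

```c
#include <assert.h>

/* Assumed: MAX_ORDER = 11, so the largest buddy order is
 * MAX_ORDER - 1 = 10, covering 1024 pages per chunk. */
#define MAX_ORDER          11
#define MAX_ORDER_NR_PAGES (1UL << (MAX_ORDER - 1))

/* How many MAX_ORDER-1 chunks a pageblock of the given order is split
 * into by the fix; a small pageblock is freed with one call. */
static unsigned long cma_split_chunks(unsigned int pageblock_order)
{
    unsigned long pageblock_nr_pages = 1UL << pageblock_order;

    if (pageblock_order >= MAX_ORDER)
        return pageblock_nr_pages / MAX_ORDER_NR_PAGES;
    return 1; /* whole pageblock freed by a single __free_pages() */
}
```

With pageblock_order = 13 (a 512 MiB pageblock of 64 KiB pages), the
loop in the patch runs eight times, once per order-10 chunk.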

Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Michal Nazarewicz <mina86@xxxxxxxxxx>
Reported-by: Mark Salter <msalter@xxxxxxxxxx>
Tested-by: Christopher Covington <cov@xxxxxxxxxxxxxx>
---
mm/page_alloc.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

Mark Salter wrote:
> I ended up needing this (on top of your patch) to get the system to
> boot. Each MAX_ORDER-1 group needs the refcount and migratetype set
> so that __free_pages does the right thing.
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 02fb1ed..a7ca6cc 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -799,17 +799,18 @@ void __init init_cma_reserved_pageblock(struct page *page)
> set_page_count(p, 0);
> } while (++p, --i);
>
> - set_page_refcounted(page);
> - set_pageblock_migratetype(page, MIGRATE_CMA);
> -
> - if (pageblock_order > MAX_ORDER) {
> - i = pageblock_order - MAX_ORDER;
> + if (pageblock_order >= MAX_ORDER) {
> + i = pageblock_order - MAX_ORDER + 1;
> i = 1 << i;
> p = page;
> do {
> - __free_pages(p, MAX_ORDER);
> + set_page_refcounted(p);
> + set_pageblock_migratetype(p, MIGRATE_CMA);
> + __free_pages(p, MAX_ORDER - 1);
> } while (p += MAX_ORDER_NR_PAGES, --i);
> } else {
> + set_page_refcounted(page);
> + set_pageblock_migratetype(page, MIGRATE_CMA);
> __free_pages(page, pageblock_order);
> }

This is kinda embarrassing, dunno how I missed that.

But each page actually does not need to have migratetype set, does it?
All of those pages are in a single pageblock so a single call
suffices. If you track set_pageblock_migratetype down to pfn_to_bitidx
there is:

return (pfn >> pageblock_order) * NR_PAGEBLOCK_BITS;

so for pfns inside the same pageblock, the low pageblock_order bits are
shifted away and they all map to the same bit index. Or did I miss
yet another thing?
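That truncation can be checked with a tiny standalone sketch; the
pageblock_order and NR_PAGEBLOCK_BITS values here are assumed examples,
not constants quoted from this kernel configuration:

```c
#include <assert.h>

/* Assumed example values: pageblock_order = 13 (arm64, 64K pages) and
 * four migratetype bits per pageblock. */
#define PAGEBLOCK_ORDER   13
#define NR_PAGEBLOCK_BITS 4

/* Mirrors the bit-index computation quoted above: the low
 * pageblock_order bits of the pfn are shifted away, so every pfn in a
 * given pageblock yields the same bitmap index. */
static unsigned long pfn_to_bitidx(unsigned long pfn)
{
    return (pfn >> PAGEBLOCK_ORDER) * NR_PAGEBLOCK_BITS;
}
```

So calling set_pageblock_migratetype() for each MAX_ORDER-1 chunk inside
one pageblock would just rewrite the same bits.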

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ee92384..fef9614 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -816,9 +816,21 @@ void __init init_cma_reserved_pageblock(struct page *page)
set_page_count(p, 0);
} while (++p, --i);

- set_page_refcounted(page);
set_pageblock_migratetype(page, MIGRATE_CMA);
- __free_pages(page, pageblock_order);
+
+ if (pageblock_order >= MAX_ORDER) {
+ i = pageblock_nr_pages;
+ p = page;
+ do {
+ set_page_refcounted(p);
+ __free_pages(p, MAX_ORDER - 1);
+ p += MAX_ORDER_NR_PAGES;
+ } while (i -= MAX_ORDER_NR_PAGES);
+ } else {
+ set_page_refcounted(page);
+ __free_pages(page, pageblock_order);
+ }
+
adjust_managed_page_count(page, pageblock_nr_pages);
}
#endif
--
2.0.0.526.g5318336