[PATCH] mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done}

From: Dan Williams
Date: Wed Jan 04 2017 - 19:38:59 EST


Both arch_add_memory() and arch_remove_memory() expect a single threaded
context.

For example, arch/x86/mm/init_64.c::kernel_physical_mapping_init() does
not hold any locks over this check and branch:

if (pgd_val(*pgd)) {
pud = (pud_t *)pgd_page_vaddr(*pgd);
paddr_last = phys_pud_init(pud, __pa(vaddr),
__pa(vaddr_end),
page_size_mask);
continue;
}

pud = alloc_low_page();
paddr_last = phys_pud_init(pud, __pa(vaddr), __pa(vaddr_end),
page_size_mask);

The result is that two threads calling devm_memremap_pages()
simultaneously can end up colliding on pgd initialization. This leads to
crash signatures like the following where the loser of the race
initializes the wrong pgd entry:

BUG: unable to handle kernel paging request at ffff888ebfff0000
IP: [<ffffffff8149e1e6>] memcpy_erms+0x6/0x10
PGD 2f8e8fc067 PUD 0 /* <---- Invalid PUD */
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
CPU: 54 PID: 3818 Comm: systemd-udevd Not tainted 4.6.7+ #13
task: ffff882fac290040 ti: ffff882f887a4000 task.ti: ffff882f887a4000
RIP: 0010:[<ffffffff8149e1e6>] [<ffffffff8149e1e6>] memcpy_erms+0x6/0x10
[..]
Call Trace:
[<ffffffffc0119045>] ? pmem_do_bvec+0x205/0x370 [nd_pmem]
[<ffffffff8145b82a>] ? blk_queue_enter+0x3a/0x280
[<ffffffffc0119418>] pmem_rw_page+0x38/0x80 [nd_pmem]
[<ffffffff812b1c94>] bdev_read_page+0x84/0xb0

Hold the standard memory hotplug mutex over calls to
arch_{add,remove}_memory().

Cc: <stable@xxxxxxxxxxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Fixes: 41e94a851304 ("add devm_memremap_pages")
Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
---
kernel/memremap.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/kernel/memremap.c b/kernel/memremap.c
index b501e390bb34..9ecedc28b928 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -246,7 +246,9 @@ static void devm_memremap_pages_release(struct device *dev, void *data)
/* pages are dead and unused, undo the arch mapping */
align_start = res->start & ~(SECTION_SIZE - 1);
align_size = ALIGN(resource_size(res), SECTION_SIZE);
+ mem_hotplug_begin();
arch_remove_memory(align_start, align_size);
+ mem_hotplug_done();
untrack_pfn(NULL, PHYS_PFN(align_start), align_size);
pgmap_radix_release(res);
dev_WARN_ONCE(dev, pgmap->altmap && pgmap->altmap->alloc,
@@ -358,7 +360,9 @@ void *devm_memremap_pages(struct device *dev, struct resource *res,
if (error)
goto err_pfn_remap;

+ mem_hotplug_begin();
error = arch_add_memory(nid, align_start, align_size, true);
+ mem_hotplug_done();
if (error)
goto err_add_memory;