Re: [PATCH v2 1/2] mm: uninitialized struct page poisoning sanity checking
From: Pavel Tatashin
Date: Tue Mar 13 2018 - 20:39:47 EST
Hi Sasha,
It seems the patch is doing the right thing: it catches bugs. Here
we access an uninitialized struct page. The question is why this happens.
register_mem_sect_under_node(struct memory_block *mem_blk, int nid)
    page_nid = get_nid_for_pfn(pfn);
The node id is stored in the page flags, and since the struct page is
still poisoned and the poison pattern is recognized, the panic is triggered.
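To make the mechanism concrete, here is a rough user-space sketch of the idea, not the kernel code itself. Everything prefixed with fake_ or FAKE_ is hypothetical: the struct layout, the 10-bit node field at the top of flags, and returning -1 where the kernel would dump the page and panic are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct page. */
struct fake_page {
	uint64_t flags;     /* node id lives in the upper bits (assumed layout) */
	uint64_t other[7];  /* placeholder for the remaining fields */
};

/* Assumed layout for illustration: node id in the top 10 bits of flags. */
#define FAKE_NODES_WIDTH  10
#define FAKE_NODES_SHIFT  (64 - FAKE_NODES_WIDTH)
#define FAKE_NODES_MASK   ((UINT64_C(1) << FAKE_NODES_WIDTH) - 1)

/* The poison pattern is all ones. */
#define FAKE_PAGE_POISON  UINT64_MAX

/* Boot-time poisoning: fill the whole struct with 0xff bytes. */
static void fake_poison_page(struct fake_page *p)
{
	memset(p, 0xff, sizeof(*p));
}

/* A page whose flags still equal the pattern was never initialized. */
static bool fake_page_poisoned(const struct fake_page *p)
{
	return p->flags == FAKE_PAGE_POISON;
}

/* Rough analogue of initializing a single page: clear it, record the node. */
static void fake_init_page(struct fake_page *p, unsigned int nid)
{
	memset(p, 0, sizeof(*p));
	p->flags |= ((uint64_t)nid & FAKE_NODES_MASK) << FAKE_NODES_SHIFT;
}

/*
 * Node lookup with the sanity check in front of it. Returns -1 for a
 * poisoned page; this sketch does not panic.
 */
static int fake_page_to_nid(const struct fake_page *p)
{
	if (fake_page_poisoned(p))
		return -1;
	return (int)((p->flags >> FAKE_NODES_SHIFT) & FAKE_NODES_MASK);
}
```

In this sketch, calling fake_page_to_nid() on a page that was poisoned but never passed through fake_init_page() hits the check, which corresponds to the situation in the trace where get_nid_for_pfn() is reached for such a page.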
Do you have a config file? Also, do you have instructions on how to reproduce it?
Thank you,
Pasha
On Tue, Mar 13, 2018 at 7:43 PM, Sasha Levin
<Alexander.Levin@xxxxxxxxxxxxx> wrote:
> On Wed, Jan 31, 2018 at 04:02:59PM -0500, Pavel Tatashin wrote:
>>During boot we poison struct page memory in order to ensure that no one is
>>accessing this memory until the struct pages are initialized in
>>__init_single_page().
>>
>>This patch adds more scrutiny to this checking by making sure that flags
>>do not equal the poison pattern when they are accessed. The pattern is all
>>ones.
>>
>>Since the node id is also stored in struct page, and may be accessed quite
>>early, we add the enforcement to the page_to_nid() function as well.
>>
>>Signed-off-by: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
>>---
>
> Hey Pasha,
>
> This patch is causing the following on boot:
>
> [ 1.253732] BUG: unable to handle kernel paging request at fffffffffffffffe
> [ 1.254000] PGD 2284e19067 P4D 2284e19067 PUD 2284e1b067 PMD 0
> [ 1.254000] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
> [ 1.254000] Modules linked in:
> [ 1.254000] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc5-next-20180313 #10
> [ 1.254000] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
> [ 1.254000] RIP: 0010:__dump_page (??:?)
> [ 1.254000] RSP: 0000:ffff881c63c17810 EFLAGS: 00010246
> [ 1.254000] RAX: dffffc0000000000 RBX: ffffea0084000000 RCX: 1ffff1038c782f2b
> [ 1.254000] RDX: 1fffffffffffffff RSI: ffffffff9e160640 RDI: ffffea0084000000
> [ 1.254000] RBP: ffff881c63c17c00 R08: ffff8840107e8880 R09: ffffed0802167a4d
> [ 1.254000] R10: 0000000000000001 R11: ffffed0802167a4c R12: 1ffff1038c782f07
> [ 1.254000] R13: ffffea0084000020 R14: fffffffffffffffe R15: ffff881c63c17bd8
> [ 1.254000] FS: 0000000000000000(0000) GS:ffff881c6ac00000(0000) knlGS:0000000000000000
> [ 1.254000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1.254000] CR2: fffffffffffffffe CR3: 0000002284e16000 CR4: 00000000003406e0
> [ 1.254000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1.254000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 1.254000] Call Trace:
> [ 1.254000] dump_page (/mm/debug.c:80)
> [ 1.254000] get_nid_for_pfn (/./include/linux/mm.h:900 /drivers/base/node.c:396)
> [ 1.254000] register_mem_sect_under_node (/drivers/base/node.c:438)
> [ 1.254000] link_mem_sections (/drivers/base/node.c:517)
> [ 1.254000] topology_init (/./include/linux/nodemask.h:271 /arch/x86/kernel/topology.c:164)
> [ 1.254000] do_one_initcall (/init/main.c:835)
> [ 1.254000] kernel_init_freeable (/init/main.c:901 /init/main.c:909 /init/main.c:927 /init/main.c:1076)
> [ 1.254000] kernel_init (/init/main.c:1004)
> [ 1.254000] ret_from_fork (/arch/x86/entry/entry_64.S:417)
> [ 1.254000] Code: ff a8 01 4c 0f 44 f3 4d 85 f6 0f 84 31 0e 00 00 4c 89 f2 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 2d 11 00 00 <49> 83 3e ff 0f 84 a9 06 00 00 4d 8d b7 c0 fd ff ff 48 b8 00 00
> All code
> ========
> 0: ff a8 01 4c 0f 44 ljmp *0x440f4c01(%rax)
> 6: f3 4d 85 f6 repz test %r14,%r14
> a: 0f 84 31 0e 00 00 je 0xe41
> 10: 4c 89 f2 mov %r14,%rdx
> 13: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
> 1a: fc ff df
> 1d: 48 c1 ea 03 shr $0x3,%rdx
> 21: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1)
> 25: 0f 85 2d 11 00 00 jne 0x1158
> 2b:* 49 83 3e ff cmpq $0xffffffffffffffff,(%r14) <-- trapping instruction
> 2f: 0f 84 a9 06 00 00 je 0x6de
> 35: 4d 8d b7 c0 fd ff ff lea -0x240(%r15),%r14
> 3c: 48 rex.W
> 3d: b8 .byte 0xb8
> ...
>
> Code starting with the faulting instruction
> ===========================================
> 0: 49 83 3e ff cmpq $0xffffffffffffffff,(%r14)
> 4: 0f 84 a9 06 00 00 je 0x6b3
> a: 4d 8d b7 c0 fd ff ff lea -0x240(%r15),%r14
> 11: 48 rex.W
> 12: b8 .byte 0xb8
> ...
> [ 1.254000] RIP: __dump_page+0x1c8/0x13c0 RSP: ffff881c63c17810 (/./include/asm-generic/sections.h:42)
> [ 1.254000] CR2: fffffffffffffffe
> [ 1.254000] ---[ end trace e643dfbc44b562ca ]---
>
> --
>
> Thanks,
> Sasha