Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

From: Luis R. Rodriguez
Date: Wed May 24 2017 - 13:55:15 EST


On Tue, May 23, 2017 at 04:48:50PM +0200, Luis R. Rodriguez wrote:
> On Sat, May 20, 2017 at 11:38:50AM +0900, Masami Hiramatsu wrote:
> > Hi Luis,
> >
> > On Fri, 19 May 2017 19:28:54 +0200
> > "Luis R. Rodriguez" <mcgrof@xxxxxxxxxx> wrote:
> > >
> > > Aha! And the winner is:
> > >
> > > CONFIG_KPROBES_SANITY_TEST
> > >
> > > I confirm disabling it on 4.3.0-rc3 and on linux-next next-20170519 avoids the WARN.
> > > I also can confirm using the 'echo dump=mem-area > /sys/kernel/debug/kmemleak' yields
> > > the same trace for both of these kernels.
> > >
> > > So -- the above kmemleak hack seems to actually work to seek who owns that page.
> > >
> > > Now to figure out how the hell kernel/test_kprobes.c screws around with things.
> >
> > Ah, that was fixed recently;
> >
> > https://marc.info/?l=linux-kernel&m=149076389011850
> >
> > Note that this patch depends another patch in the series;
> >
> > https://marc.info/?l=linux-kernel&m=149076370111812&w=2
>
> I actually boot tested linux-next tag next-20170519 which carries these
> patches and the WARNING still is there. Please note the issue is with
> CONFIG_KPROBES_SANITY_TEST enabled.
>
> [ 1.025601] x86/mm: Found insecure W+X mapping at address ffffffffc01e7000/0xffffffffc01e7000
> [ 1.026429] ------------[ cut here ]------------
> [ 1.026885] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> [ 1.027711] Modules linked in:
> [ 1.028032] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170519 #151
> [ 1.028788] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> [ 1.029928] task: ffff9fd47a5ccc80 task.stack: ffffb6bcc0630000
> [ 1.030509] RIP: 0010:note_page+0x630/0x7e0
> [ 1.030917] RSP: 0000:ffffb6bcc0633df0 EFLAGS: 00010286
> [ 1.031425] RAX: 0000000000000051 RBX: ffffb6bcc0633e88 RCX: ffffffffbb656708
> [ 1.032132] RDX: 0000000000000000 RSI: 0000000000000096 RDI: 0000000000000246
> [ 1.032834] RBP: ffffb6bcc0633e28 R08: 203a6d6d2f363878 R09: 0000000000000161
> [ 1.033539] R10: ffffb6bcc0633dd8 R11: 736e6920646e756f R12: 0000000000000000
> [ 1.034235] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000000
> [ 1.034927] FS: 0000000000000000(0000) GS:ffff9fd47fc80000(0000) knlGS:0000000000000000
> [ 1.035722] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1.036290] CR2: ffffb6bcc073c000 CR3: 0000000053209000 CR4: 00000000000006e0
> [ 1.036839] Call Trace:
> [ 1.037034] ptdump_walk_pgd_level_core+0x3e7/0x490
> [ 1.037367] ? 0xffffffffbaa00000
> [ 1.037705] ptdump_walk_pgd_level_checkwx+0x17/0x20
> [ 1.038187] mark_rodata_ro+0xf4/0x100
> [ 1.038559] ? rest_init+0x80/0x80
> [ 1.038890] kernel_init+0x2f/0x100
> [ 1.039235] ret_from_fork+0x2c/0x40
> [ 1.039582] Code: 48 c7 43 28 00 00 00 00 48 89 43 20 e9 05 fd ff ff 48 8b 73 10 48 c7 c7 f0 3d 3e bb c6 05 f8 eb bc 00 01 48 89 f2 e8 1d 02 12 00 <0f> ff e9 1f fa ff ff 48 8b 70 20 48 c7 c7 3c ba 3e bb e8 06 02
> [ 1.041416] ---[ end trace e726c1b63e5a81a9 ]---
> [ 1.041872] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
>
> root@piggy:~# echo dump=0xffffffffc01e7000 > /sys/kernel/debug/kmemleak
>
> On dmesg:
>
> May 23 07:44:51 piggy kernel: kmemleak: Object 0xffffffffc01e7000 (size 335):
> May 23 07:44:51 piggy kernel: kmemleak: comm "swapper/0", pid 1, jiffies 4294892451
> May 23 07:44:51 piggy kernel: kmemleak: min_count = 2
> May 23 07:44:51 piggy kernel: kmemleak: count = 2
> May 23 07:44:51 piggy kernel: kmemleak: flags = 0x1
> May 23 07:44:51 piggy kernel: kmemleak: checksum = 0
> May 23 07:44:51 piggy kernel: kmemleak: backtrace:
> May 23 07:44:51 piggy kernel: kmemleak_alloc+0x4a/0xa0
> May 23 07:44:51 piggy kernel: __vmalloc_node_range+0x20c/0x2b0
> May 23 07:44:51 piggy kernel: module_alloc+0x67/0xc0
> May 23 07:44:51 piggy kernel: arch_ftrace_update_trampoline+0xc1/0x240
> May 23 07:44:51 piggy kernel: ftrace_startup+0x92/0x210
> May 23 07:44:51 piggy kernel: register_ftrace_function+0x4b/0x60
> May 23 07:44:51 piggy kernel: arm_kprobe+0x84/0xc0
> May 23 07:44:51 piggy kernel: register_kprobe+0x59c/0x5e0
> May 23 07:44:51 piggy kernel: init_test_probes+0x61/0x560
> May 23 07:44:51 piggy kernel: init_kprobes+0x1ea/0x20d
> May 23 07:44:51 piggy kernel: do_one_initcall+0x52/0x1a0
> May 23 07:44:51 piggy kernel: kernel_init_freeable+0x17d/0x205
> May 23 07:44:51 piggy kernel: kernel_init+0xe/0x100
> May 23 07:44:51 piggy kernel: ret_from_fork+0x2c/0x40
> May 23 07:44:51 piggy kernel: 0xffffffffffffffff

Turns out that Thomas Gleixner's patch from today [0] fixes this as the same
module_alloc() path was the culprit of the issue. Steven Rostedt however just
reported that this patch crashes on his ftracetests, so it would seem we just
need to address that last kink to fix this properly.

We can take this further on, in that other thread.

[0] alpine.DEB.2.20.1705241459480.2201@nanos">https://lkml.kernel.org/r/alpine.DEB.2.20.1705241459480.2201@nanos
[1] https://lkml.kernel.org/r/20170524134728.61a896c9@xxxxxxxxxxxxxxxxx

Luis