Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0

From: Luis R. Rodriguez
Date: Wed May 17 2017 - 12:40:32 EST


On Mon, May 15, 2017 at 05:12:18PM -0700, Kees Cook wrote:
> On Mon, May 15, 2017 at 4:45 PM, Luis R. Rodriguez <mcgrof@xxxxxxxxxx> wrote:
> > On Mon, May 15, 2017 at 3:57 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> >> On Mon, May 15, 2017 at 3:15 PM, Luis R. Rodriguez <mcgrof@xxxxxxxxxx> wrote:
> >>> On Tue, May 16, 2017 at 12:06:50AM +0200, Luis R. Rodriguez wrote:
> >>>> Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)
> >>>>
> >>>> I will try updating my distro package for qemu and see if perhaps its this
> >>>> and for the other odd fork issue I reported [0].
> >>>>
> >>>> [0] https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=SwG_Q@xxxxxxxxxxxxxx
> >>>
> >>> Yeah nope, using my distribution latest:
> >>>
> >>> QEMU emulator version 2.8.0(openSUSE Tumbleweed)
> >>>
> >>> And still both issues are present.
> >>>
> >>> Luis
> >>
> >> Can you enable CONFIG_X86_PTDUMP=y and then find out what is located
> >> at ffffffffc0288000 via /sys/kernel/debug/kernel_page_tables ?
> >
> > Sure thing.
> >
> > Recompiled with this enabled, new warning:
> >
> > [ 0.891559] x86/mm: Found insecure W+X mapping at address
> > ffffffffc00e4000/0xffffffffc00e4000
> > [ 0.892394] ------------[ cut here ]------------
> > [ 0.892834] WARNING: CPU: 0 PID: 1 at
> > arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > [ 0.893674] Modules linked in:
> > [ 0.893972] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> > 4.12.0-rc1-next-20170515+ #145
> > [ 0.894687] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> > [ 0.895828] task: ffff8ed7fa5ccc80 task.stack: ffffae3900630000
> > [ 0.896403] RIP: 0010:note_page+0x630/0x7e0
> > [ 0.896780] RSP: 0018:ffffae3900633df0 EFLAGS: 00010286
> > [ 0.897271] RAX: 0000000000000051 RBX: ffffae3900633e88 RCX: ffffffff9b456708
> > [ 0.897940] RDX: 0000000000000000 RSI: 0000000000000096 RDI: 0000000000000246
> > [ 0.898624] RBP: ffffae3900633e28 R08: 203a6d6d2f363878 R09: 0000000000000165
> > [ 0.899314] R10: ffffae3900633dd8 R11: 736e6920646e756f R12: 0000000000000000
> > [ 0.899987] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000000
> > [ 0.900629] FS: 0000000000000000(0000) GS:ffff8ed7ffc00000(0000)
> > knlGS:0000000000000000
> > [ 0.901398] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 0.901908] CR2: 0000000000000000 CR3: 0000000118009000 CR4: 00000000000006f0
> > [ 0.902590] Call Trace:
> > [ 0.902827] ptdump_walk_pgd_level_core+0x3e7/0x490
> > [ 0.903274] ? 0xffffffff9a800000
> > [ 0.903595] ptdump_walk_pgd_level_checkwx+0x17/0x20
> > [ 0.904064] mark_rodata_ro+0xf4/0x100
> > [ 0.904423] ? rest_init+0x80/0x80
> > [ 0.904744] kernel_init+0x2f/0x100
> > [ 0.905068] ret_from_fork+0x2c/0x40
> > [ 0.905393] Code: 48 c7 43 28 00 00 00 00 48 89 43 20 e9 05 fd ff
> > ff 48 8b 73 10 48 c7 c7 28 36 1e 9b c6 05 c8 eb bc 00 01 48 89 f2 e8
> > cd fc 11 00 <0f> ff e9 1f fa ff ff 48 8b 70 20 48 c7 c7 65 b2 1e 9b e8
> > b6 fc
> > [ 0.907173] ---[ end trace 878b39cb0c248e66 ]---
> > [ 0.907655] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> >
> > And ffffffffc00e4000 is:
> >
> > ---[ Modules ]---
> > 0xffffffffc0000000-0xffffffffc00e4000 912K
> > pte
> > 0xffffffffc00e4000-0xffffffffc00e5000 4K RW
> > GLB x pte
> >
> > In case someone needs the full /sys/kernel/debug/kernel_page_tables file:
> >
> > http://drvbp1.linux-foundation.org/~mcgrof/2017/05/15/kernel_page_tables/piggy-4.12.0-rc1-next-20170515-page-tables.txt
>
> ---[ Modules ]---
> 0xffffffffc0000000-0xffffffffc00e4000 912K
> pte
>
> This should be the modules ASLR gap
>
> 0xffffffffc00e4000-0xffffffffc00e5000 4K RW
> GLB x pte
>
> This is part of the same gap, but it's RW+x strangely?
>
> 0xffffffffc00e5000-0xffffffffc00e6000 4K
> pte
>
> This is more of the gap?
>
> 0xffffffffc00e6000-0xffffffffc00fa000 80K ro
> GLB x pte
> 0xffffffffc00fa000-0xffffffffc010c000 72K ro
> GLB NX pte
> 0xffffffffc010c000-0xffffffffc011b000 60K RW
> GLB NX pte
>
> This should be the first loaded module. Can you check that
> 0xffffffffc00e6000 matches the first module in /proc/modules?

Yes, but I had killed that boot session again, so upon my next boot
I had a different layout, the ASLR gap was much larger:

---[ Modules ]---
0xffffffffc0000000-0xffffffffc01b0000 1728K pte
0xffffffffc01b0000-0xffffffffc01b1000 4K RW GLB x pte
0xffffffffc01b1000-0xffffffffc01b2000 4K pte
0xffffffffc01b2000-0xffffffffc01c6000 80K ro GLB x pte
0xffffffffc01c6000-0xffffffffc01cc000 24K ro GLB NX pte
0xffffffffc01cc000-0xffffffffc01d5000 36K RW GLB NX pte

As you can guess if we follow similar pattern the RW hole is the one this boot
warned about:

[ 1.450483] x86/mm: Found insecure W+X mapping at address ffffffffc01b0000/0xffffffffc01b0000
[ 1.451280] ------------[ cut here ]------------
[ 1.451721] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
[ 1.452499] Modules linked in:
[ 1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #145

I checked and indeed 0xffffffffc01b2000 is part of a module, it was not the first one
on the /proc/modules list but then again /proc/modules does not seem to have a specific
order other than perhaps being pegged into a linked list of modules once they go live,
and it seems its typically output backwards from when that happened, sorting that
by address we get:

root@piggy:~# cat /proc/modules | sort -k 6 | head -3
e1000 143360 0 - Live 0xffffffffc01b2000 (E)
mbcache 16384 1 ext4, Live 0xffffffffc01d6000 (E)
scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xffffffffc01df000 (E)

And this then seems to be the first module loaded:

e1000 143360 0 - Live 0xffffffffc01b2000 (E)

The output of dmesg seems to confirm this as per the list of modules sorted
as per above.

> Something touched the module gap and left is RW+x...

Lemme try booting with e1000 renamed to e1000.ko.ignore and see how that goes.

> Are you able to bisect this?

This issue has been present for a while so since I recall this I might be
able to reduce the number of needed target kernels to bisect. Lemme tinker
a bit and if no clear culprit comes up then will try bisect.

Luis