Re: [RESEND BUG REPORT] System hung! Due to ftrace or KASAN?

From: Zenghui Yu
Date: Sun Jan 20 2019 - 13:33:57 EST


On Sun, Jan 20, 2019 at 12:45 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>
> On Sat, Jan 19, 2019 at 5:37 PM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> >
> > On Fri, Jan 18, 2019 at 6:45 PM Zenghui Yu <zenghuiyu96@xxxxxxxxx> wrote:
> > >
> > > Hi, All!
> > >
> > > I compiled the latest kernel and installed it on my old laptop (Hardware name:
> > > Hewlett-Packard HP ProBook 440 G2/2247, BIOS M74 Ver. 01.02 06/17/2014). But
> > > when I tried to enable the function tracer via debugfs, the system _hung_!
> > > Compared with my previous kernel build, the only change I made is that I
> > > enabled the KASAN configuration this time.
> > >
> > > Reproducing this issue is simple:
> > >
> > > 0. `uname -a` shows:
> > > Linux ubuntuu 5.0.0-rc2+ #9 SMP Fri Jan 18 03:04:01 CST 2019
> > > x86_64 x86_64 x86_64 GNU/Linux
> > >
> > > 1. `make menuconfig` to enable KASAN configuration:
> > > Kernel hacking ---> Memory Debugging --->
> > > KASAN: runtime memory debugger
> > >
> > > compile, install, reboot, then `dmesg | grep kasan` shows:
> > > [ 0.342882] kasan: KernelAddressSanitizer initialized
> > >
> > > 2. enable function tracer
> > > `echo function > /sys/kernel/debug/tracing/current_tracer`
> > > (then my poor laptop was locked and didn't respond to me ...)
> > >
> > > What's more, enabling the function graph tracer causes the same problem.
> > > I have no idea what went wrong inside the kernel: is it ftrace, or is it
> > > KASAN? So I am reporting it to you and waiting for your solution!
> > >
> > > I have attached my *.config* file for those who are interested in this
> > > issue. Sorry that I can't provide any useful call trace, because the system
> > > went down so quickly.
> > >
> > > P.S. I'm a newcomer to KASAN. If I have misconfigured or misunderstood
> > > anything, please correct me and let me know :).
> >
> > Hi Zenghui,
> >
> > I've tried to reproduce this, but the kernel crashes during boot with this
> > config for me.
> > I am on commit 2339e91d0e6609e17943a0ab3c3c8c4044760c05; the config is
> > basically yours but updated for a newer compiler and with builtin
> > modules:
> > https://gist.githubusercontent.com/dvyukov/9af234617749aa4eada67ba8c2e4f46c/raw/d0e09ddf255962313a82bb552c3fc0d832fa6844/gistfile1.txt
>
> I've commented out the warning locally for now and can reproduce the
> hang. You need this commit, it fixes the hang:
> https://groups.google.com/forum/#!topic/kasan-dev/g8A8PLKCyoE

Thanks Dmitry! I'll try to test this commit tomorrow.
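
For anyone else retesting, the reproduction step from the original report
could be wrapped in a small helper like the one below. This is only a sketch:
the `set_tracer` function and the `TRACING_DIR` override are hypothetical
names introduced here for illustration, and the default path is the debugfs
location used in the report. It checks that the requested tracer is listed in
`available_tracers` and flushes pending writes before switching, since the
machine may lock up right afterwards.

```shell
#!/bin/sh
# Hypothetical helper, not part of any kernel tooling.
# TRACING_DIR may be overridden, e.g. to point at a scratch directory.
TRACING_DIR="${TRACING_DIR:-/sys/kernel/debug/tracing}"

set_tracer() {
    tracer="$1"

    # Refuse tracers this kernel does not list. -w matches whole words,
    # so "function" does not match "function_graph".
    if ! grep -qw -- "$tracer" "$TRACING_DIR/available_tracers"; then
        echo "tracer '$tracer' is not available" >&2
        return 1
    fi

    # Flush pending writes first: if the system hangs after the switch,
    # at least the logs written so far reach the disk.
    sync

    echo "$tracer" > "$TRACING_DIR/current_tracer"
}
```

After sourcing the file as root, `set_tracer function` performs the same write
as step 2 of the report, and `set_tracer nop` switches tracing back off.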

BTW, I bisected and tested this issue today. It turned out that
bffa986c6f80e39d9903015fc7d0d99a66bbf559 is the first bad commit.
So I'm wondering whether something needs to be fixed in commit bffa986c6f80 ("kasan:
move common generic and tag-based code to common.c").


Thanks!
Zenghui

>
> > TITLE: WARNING in note_page
> > MAINTAINERS: [dave.hansen@xxxxxxxxxxxxxxx luto@xxxxxxxxxx
> > peterz@xxxxxxxxxxxxx tglx@xxxxxxxxxxxxx mingo@xxxxxxxxxx bp@xxxxxxxxx
> > hpa@xxxxxxxxx x86@xxxxxxxxxx linux-kernel@xxxxxxxxxxxxxxx]
> >
> > ------------[ cut here ]------------
> > x86/mm: Found insecure W+X mapping at address native_usergs_sysret64+0x0/0x10
> > WARNING: CPU: 3 PID: 1 at arch/x86/mm/dump_pagetables.c:263 note_wx
> > arch/x86/mm/dump_pagetables.c:262 [inline]
> > WARNING: CPU: 3 PID: 1 at arch/x86/mm/dump_pagetables.c:263
> > note_page+0x800/0xaf0 arch/x86/mm/dump_pagetables.c:302
> > Kernel panic - not syncing: panic_on_warn set ...
> > CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ #16
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> > Call Trace:
> > __dump_stack lib/dump_stack.c:77 [inline]
> > dump_stack+0x7b/0xb5 lib/dump_stack.c:113
> > panic+0x18e/0x351 kernel/panic.c:214
> > __warn+0x13c/0x140 kernel/panic.c:571
> > report_bug+0xd7/0x140 lib/bug.c:186
> > fixup_bug.part.11+0x2d/0x60 arch/x86/kernel/traps.c:178
> > fixup_bug arch/x86/include/asm/paravirt.h:776 [inline]
> > do_error_trap+0xb6/0xc0 arch/x86/kernel/traps.c:271
> > do_invalid_op+0x3b/0x50 arch/x86/kernel/traps.c:290
> > invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
> > RIP: 0010:note_wx arch/x86/mm/dump_pagetables.c:262 [inline]
> > RIP: 0010:note_page+0x800/0xaf0 arch/x86/mm/dump_pagetables.c:302
> > Code: 4d 32 00 4c 89 7b 28 48 c7 43 30 00 00 00 00 e9 46 fc ff ff 4c
> > 89 ee 48 c7 c7 a0 35 c5 ae c6 05 02 81 1e 02 01 e8 50 93 01 00 <0f> 0b
> > 48 8b 7d 90 e8 75 4c 32 00 48 8b 43 20 48 89 45 c8 e9 72 f9
> > RSP: 0000:ffff88805d5c7d18 EFLAGS: 00010286
> > RAX: 0000000000000000 RBX: ffff88805d5c7e58 RCX: ffffffffae7f11db
> > RDX: 0000000000000001 RSI: 0000000000000004 RDI: 0000000000000246
> > RBP: ffff88805d5c7da0 R08: ffffed100bab8f5c R09: ffffed100bab8f5c
> > R10: 0000000000000002 R11: ffffed100bab8f5c R12: 0000000000000000
> > R13: ffffffffae800000 R14: 0000000000000004 R15: 0000000000000000
> > walk_pmd_level arch/x86/mm/dump_pagetables.c:428 [inline]
> > walk_pud_level arch/x86/mm/dump_pagetables.c:459 [inline]
> > walk_p4d_level arch/x86/mm/dump_pagetables.c:484 [inline]
> > ptdump_walk_pgd_level_core+0x566/0x6e0 arch/x86/mm/dump_pagetables.c:552
> > ptdump_walk_user_pgd_level_checkwx+0x4e/0x50 arch/x86/mm/dump_pagetables.c:600
> > pti_finalize+0x27/0xaf arch/x86/mm/pti.c:682
> > kernel_init+0x3e/0x130 init/main.c:1066
> > ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:352
> > Kernel Offset: 0x2c800000 from 0xffffffff81000000 (relocation range:
> > 0xffffffff80000000-0xffffffffbfffffff)
> > Rebooting in 86400 seconds..