Re: [PATCH] ftrace/x86: Fix function graph tracer reset path

From: Namhyung Kim
Date: Sun May 15 2016 - 09:07:28 EST


On Sun, May 15, 2016 at 09:41:41AM +0900, Masami Hiramatsu wrote:

Hi Masami,

> Hi Namhyung,
>
> I'm interested in this problem, and it seems compiling environment
> related or kconfig related problem.
> If you can reproduce this kernel, would you share what the "AS"
> commandline shows? That can be done as below;
>
> $ make V=1 arch/x86/kernel/mcount_64.o | grep mcount_64

gcc -Wp,-MD,arch/x86/kernel/.mcount_64.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-unknown-linux-gnu/5.3.0/include -I/home/namhyung/project/linux/arch/x86/include -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated -I/home/namhyung/project/linux/include -Iinclude -I/home/namhyung/project/linux/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/home/namhyung/project/linux/include/uapi -Iinclude/generated/uapi -include /home/namhyung/project/linux/include/linux/kconfig.h -D__KERNEL__ -D__ASSEMBLY__ -m64 -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -DCONFIG_AS_SHA1_NI=1 -DCONFIG_AS_SHA256_NI=1 -mfentry -DCC_USING_FENTRY -DCC_HAVE_ASM_GOTO -c -o arch/x86/kernel/mcount_64.o /home/namhyung/project/linux/arch/x86/kernel/mcount_64.S


$ gcc --version
gcc (GCC) 5.3.0
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ as --version
GNU assembler (GNU Binutils) 2.26.0.20160302
Copyright (C) 2015 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `x86_64-pc-linux-gnu'.


Thanks,
Namhyung

>
> Thank you,
>
> On Fri, 13 May 2016 22:53:43 +0900
> Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> > On my system, simply enabling and disabling function graph tracer can
> > crash the kernel. I don't know how it worked until now.
> >
> > The ftrace_disable_ftrace_graph_caller() modifies jmp instruction at
> > ftrace_graph_call assuming it's a 5 bytes near jmp (e9 <offset>).
> > However it's a short jmp consisting of 2 bytes only (eb <offset>). And
> > ftrace_stub() is located just below the ftrace_graph_caller so
> > modification above breaks the instruction resulting in kernel oops on
> > the ftrace_stub() with the invalid opcode like below:
> >
> > # cd /sys/kernel/trace
> > # echo function_graph > current_tracer
> > # echo nop > current_tracer
> >
> > [ 78.122055] invalid opcode: 0000 [#1] SMP
> > [ 78.125125] Modules linked in: x86_pkg_temp_thermal kvm_intel kvm irqbypass crc32c_intel pcspkr iwldvm iwlwifi
> > [ 78.128241] CPU: 2 PID: 17 Comm: migration/2 Not tainted 4.6.0-rc4+ #36
> > [ 78.131310] Hardware name: LENOVO 4286A74/4286A74, BIOS 8DET56WW (1.26 ) 12/01/2011
> > [ 78.134369] task: ffff88040bec4240 ti: ffff88040bee4000 task.ti: ffff88040bee4000
> > [ 78.137412] RIP: 0010:[<ffffffff818939a8>] [<ffffffff818939a8>] ftrace_stub+0x0/0x8
> > [ 78.140477] RSP: 0018:ffff88040bee7e48 EFLAGS: 00010246
> > [ 78.143489] RAX: ffff88040bec4240 RBX: ffffffff8107a7b0 RCX: 000000000000001f
> > [ 78.146512] RDX: 0000000000000000 RSI: ffff88041e2929d8 RDI: ffff88040bee7e50
> > [ 78.149581] RBP: ffff88040bee7e80 R08: ffff88040bee4000 R09: 0000000000000000
> > [ 78.152647] R10: 00000000000318b7 R11: ffff8800d661f800 R12: ffff88040d8011b0
> > [ 78.155679] R13: ffffffff81e43620 R14: ffff88040bda8588 R15: ffffffff81e503e0
> > [ 78.158675] FS: 0000000000000000(0000) GS:ffff88041e280000(0000) knlGS:0000000000000000
> > [ 78.161699] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 78.164690] CR2: 00007fadb22dde1d CR3: 00000000d5ce2000 CR4: 00000000000406e0
> > [ 78.167691] Stack:
> > [ 78.170658] ffffffff8110b3ee ffffffff8188de90 00000009161d55f6 00000012306fc2e4
> > [ 78.173710] 0000000000000000 ffff880400000000 ffff88040bec4240 ffff88040bee7ec8
> > [ 78.176783] ffffffff81893bbd 0000000000000000 ffff88040bec4240 ffffffff81893ba8
> > [ 78.179863] Call Trace:
> > [ 78.182853] [<ffffffff8110b3ee>] ? ftrace_return_to_handler+0x8e/0x100
> > [ 78.185909] [<ffffffff8188de90>] ? __schedule+0xae0/0xae0
> > [ 78.188941] [<ffffffff81893bbd>] return_to_handler+0x15/0x27
> > [ 78.192001] [<ffffffff81893ba8>] ? ftrace_graph_caller+0xa8/0xa8
> > [ 78.195091] [<ffffffff8107a6f0>] ? sort_range+0x30/0x30
> > [ 78.198138] [<ffffffff810778a9>] kthread+0xc9/0xe0
> > [ 78.201143] [<ffffffff81891a12>] ret_from_fork+0x22/0x40
> > [ 78.204138] [<ffffffff810777e0>] ? kthread_worker_fn+0x170/0x170
> > [ 78.207129] Code: 8b 44 24 48 48 8b 7c 24 70 48 8b 74 24 68 48 8b 54 24 60
> > 48 8b 4c 24 58 48 8b 44 24 50 48 8b 6c 24 20 48 81 c4 d0
> > 00 00 00 e9 fd <ff> ff ff 80 00 00 00 00 9c 55 ff 74 24 18
> > 55 48 89 e5 ff 74 24
> > [ 78.213997] RIP [<ffffffff818939a8>] ftrace_stub+0x0/0x8
> > [ 78.217190] RSP <ffff88040bee7e48>
> > [ 78.220374] ---[ end trace 0af2f0d9f7301011 ]---
> >
> > Looking at the code dump, the ftrace_stub instruction was overwritten as
> > <ff>. Below is disassembly output of related code.
> >
> > $ objdump -d --start-address=0xffffffff818939a6 --stop-address=0xffffffff818939b0 vmlinux
> >
> > vmlinux: file format elf64-x86-64
> >
> > Disassembly of section .text:
> >
> > ffffffff818939a6 <ftrace_epilogue>:
> > ffffffff818939a6: eb 00 jmp ffffffff818939a8 <ftrace_stub>
> >
> > ffffffff818939a8 <ftrace_stub>:
> > ffffffff818939a8: c3 retq
> > ffffffff818939a9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
> >
> > As you can see ftrace_epilogue (same as ftrace_graph_caller) is 2 byte
> > ahead of ftrace_stub. And it's replaced by a jump to ftrace_stub() by
> > ftrace_disable_ftrace_graph_caller: "e9 <0xfffffffd>". Pads 3 bytes
> > after ftrace_epilogue to prevent ftrace_stub from being overwritten.
> >
> > Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> > ---
> > arch/x86/kernel/mcount_64.S | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
> > index ed48a9f465f8..0e6af57a713a 100644
> > --- a/arch/x86/kernel/mcount_64.S
> > +++ b/arch/x86/kernel/mcount_64.S
> > @@ -180,6 +180,15 @@ GLOBAL(ftrace_epilogue)
> > #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > GLOBAL(ftrace_graph_call)
> > jmp ftrace_stub
> > + /*
> > + * The above jmp is generated as a short jump which occupies 2 bytes
> > + * but ftrace_enable/disable_ftrace_graph_caller() assumes it's a
> > + * near jump which occupies 5 bytes so breaks ftrace_stub() below.
> > + * Add 3 bytes padding to avoid that.
> > + */
> > + nop
> > + nop
> > + nop
> > #endif
> >
> > GLOBAL(ftrace_stub)
> > --
> > 2.8.0
> >
>
>
> --
> Masami Hiramatsu <mhiramat@xxxxxxxxxx>