[PATCH] ftrace/x86: Fix function graph tracer reset path
From: Namhyung Kim
Date: Fri May 13 2016 - 09:54:05 EST
On my system, simply enabling and disabling function graph tracer can
crash the kernel. I don't know how it worked until now.
The ftrace_disable_ftrace_graph_caller() modifies jmp instruction at
ftrace_graph_call assuming it's a 5 bytes near jmp (e9 <offset>).
However it's a short jmp consisting of 2 bytes only (eb <offset>). And
ftrace_stub() is located just below the ftrace_graph_caller so
modification above breaks the instruction resulting in kernel oops on
the ftrace_stub() with the invalid opcode like below:
# cd /sys/kernel/trace
# echo function_graph > current_tracer
# echo nop > current_tracer
[ 78.122055] invalid opcode: 0000 [#1] SMP
[ 78.125125] Modules linked in: x86_pkg_temp_thermal kvm_intel kvm irqbypass crc32c_intel pcspkr iwldvm iwlwifi
[ 78.128241] CPU: 2 PID: 17 Comm: migration/2 Not tainted 4.6.0-rc4+ #36
[ 78.131310] Hardware name: LENOVO 4286A74/4286A74, BIOS 8DET56WW (1.26 ) 12/01/2011
[ 78.134369] task: ffff88040bec4240 ti: ffff88040bee4000 task.ti: ffff88040bee4000
[ 78.137412] RIP: 0010:[<ffffffff818939a8>] [<ffffffff818939a8>] ftrace_stub+0x0/0x8
[ 78.140477] RSP: 0018:ffff88040bee7e48 EFLAGS: 00010246
[ 78.143489] RAX: ffff88040bec4240 RBX: ffffffff8107a7b0 RCX: 000000000000001f
[ 78.146512] RDX: 0000000000000000 RSI: ffff88041e2929d8 RDI: ffff88040bee7e50
[ 78.149581] RBP: ffff88040bee7e80 R08: ffff88040bee4000 R09: 0000000000000000
[ 78.152647] R10: 00000000000318b7 R11: ffff8800d661f800 R12: ffff88040d8011b0
[ 78.155679] R13: ffffffff81e43620 R14: ffff88040bda8588 R15: ffffffff81e503e0
[ 78.158675] FS: 0000000000000000(0000) GS:ffff88041e280000(0000) knlGS:0000000000000000
[ 78.161699] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 78.164690] CR2: 00007fadb22dde1d CR3: 00000000d5ce2000 CR4: 00000000000406e0
[ 78.167691] Stack:
[ 78.170658] ffffffff8110b3ee ffffffff8188de90 00000009161d55f6 00000012306fc2e4
[ 78.173710] 0000000000000000 ffff880400000000 ffff88040bec4240 ffff88040bee7ec8
[ 78.176783] ffffffff81893bbd 0000000000000000 ffff88040bec4240 ffffffff81893ba8
[ 78.179863] Call Trace:
[ 78.182853] [<ffffffff8110b3ee>] ? ftrace_return_to_handler+0x8e/0x100
[ 78.185909] [<ffffffff8188de90>] ? __schedule+0xae0/0xae0
[ 78.188941] [<ffffffff81893bbd>] return_to_handler+0x15/0x27
[ 78.192001] [<ffffffff81893ba8>] ? ftrace_graph_caller+0xa8/0xa8
[ 78.195091] [<ffffffff8107a6f0>] ? sort_range+0x30/0x30
[ 78.198138] [<ffffffff810778a9>] kthread+0xc9/0xe0
[ 78.201143] [<ffffffff81891a12>] ret_from_fork+0x22/0x40
[ 78.204138] [<ffffffff810777e0>] ? kthread_worker_fn+0x170/0x170
[ 78.207129] Code: 8b 44 24 48 48 8b 7c 24 70 48 8b 74 24 68 48 8b 54 24 60
48 8b 4c 24 58 48 8b 44 24 50 48 8b 6c 24 20 48 81 c4 d0
00 00 00 e9 fd <ff> ff ff 80 00 00 00 00 9c 55 ff 74 24 18
55 48 89 e5 ff 74 24
[ 78.213997] RIP [<ffffffff818939a8>] ftrace_stub+0x0/0x8
[ 78.217190] RSP <ffff88040bee7e48>
[ 78.220374] ---[ end trace 0af2f0d9f7301011 ]---
Looking at the code dump, the ftrace_stub instruction was overwritten as
<ff>. Below is disassembly output of related code.
$ objdump -d --start-address=0xffffffff818939a6 --stop-address=0xffffffff818939b0 vmlinux
vmlinux: file format elf64-x86-64
Disassembly of section .text:
ffffffff818939a6 <ftrace_epilogue>:
ffffffff818939a6: eb 00 jmp ffffffff818939a8 <ftrace_stub>
ffffffff818939a8 <ftrace_stub>:
ffffffff818939a8: c3 retq
ffffffff818939a9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
As you can see ftrace_epilogue (same as ftrace_graph_caller) is 2 byte
ahead of ftrace_stub. And it's replaced by a jump to ftrace_stub() by
ftrace_disable_ftrace_graph_caller: "e9 <0xfffffffd>". Pads 3 bytes
after ftrace_epilogue to prevent ftrace_stub from being overwritten.
Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
---
arch/x86/kernel/mcount_64.S | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index ed48a9f465f8..0e6af57a713a 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -180,6 +180,15 @@ GLOBAL(ftrace_epilogue)
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
GLOBAL(ftrace_graph_call)
jmp ftrace_stub
+ /*
+ * The above jmp is generated as a short jump which occupies 2 bytes
+ * but ftrace_enable/disable_ftrace_graph_caller() assumes it's a
+ * near jump which occupies 5 bytes so breaks ftrace_stub() below.
+ * Add 3 bytes padding to avoid that.
+ */
+ nop
+ nop
+ nop
#endif
GLOBAL(ftrace_stub)
--
2.8.0