Re: [PATCH 00/10] ftrace: Add register_ftrace_direct()

From: Josh Poimboeuf
Date: Fri Nov 08 2019 - 17:51:18 EST


On Fri, Nov 08, 2019 at 04:28:34PM -0500, Steven Rostedt wrote:
>
> Alexei mentioned that he would like a way to access the ftrace fentry
> code to jump directly to a custom eBPF trampoline instead of using
> ftrace regs caller, as he said it would be faster.
>
> I proposed a new register_ftrace_direct() function that would allow
> this to happen and still work with the ftrace infrastructure. I posted
> a proof of concept patch here:
>
> https://lore.kernel.org/r/20191023122307.756b4978@xxxxxxxxxxxxxxxxxx
>
> This patch series is a more complete version, and the start of the
> actual implementation. I haven't run it through my full test suite but
> it passes my smoke tests and some other custom tests I built.
>
> I also realized that I need to make the sample modules depend on X86_64
> as it has inlined assembly in it that requires that dependency.
>
> This is based on 5.4-rc6 plus the permanent patches that prevent
> a ftrace_ops from being disabled by /proc/sys/kernel/ftrace_enabled
>
> Below is the tree that contains this code.
>
> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
> ftrace/direct

Here's the fix for the objtool warning:

From: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Subject: [PATCH] ftrace/x86: Tell objtool to ignore nondeterministic ftrace stack layout

Objtool complains about the new ftrace direct trampoline code:

arch/x86/kernel/ftrace_64.o: warning: objtool: ftrace_regs_caller()+0x190: stack state mismatch: cfa1=7+16 cfa2=7+24

Typically, code has a deterministic stack layout, such that at a given
instruction address, the stack frame size is always the same.

That's not the case for the new ftrace_regs_caller() code after it
adjusts the stack for the direct case. Just plead ignorance and assume
it's always the non-direct path. Note this creates a tiny window for
ORC to get confused.

Reported-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
Signed-off-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
---
arch/x86/include/asm/unwind_hints.h | 8 ++++++++
arch/x86/kernel/ftrace_64.S | 12 +++++++++++-
2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h
index 0bcdb1279361..f5e2eb12cb71 100644
--- a/arch/x86/include/asm/unwind_hints.h
+++ b/arch/x86/include/asm/unwind_hints.h
@@ -86,6 +86,14 @@
UNWIND_HINT sp_offset=\sp_offset
.endm

+.macro UNWIND_HINT_SAVE
+ UNWIND_HINT type=UNWIND_HINT_TYPE_SAVE
+.endm
+
+.macro UNWIND_HINT_RESTORE
+ UNWIND_HINT type=UNWIND_HINT_TYPE_RESTORE
+.endm
+
#else /* !__ASSEMBLY__ */

#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 5d946ab40b52..1c79624a36b2 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -175,6 +175,8 @@ ENTRY(ftrace_regs_caller)
/* Save the current flags before any operations that can change them */
pushfq

+ UNWIND_HINT_SAVE
+
/* added 8 bytes to save flags */
save_mcount_regs 8
/* save_mcount_regs fills in first two parameters */
@@ -249,8 +251,16 @@ GLOBAL(ftrace_regs_call)
1: restore_mcount_regs


+2:
+ /*
+ * The stack layout is nondetermistic here, depending on which path was
+ * taken. This confuses objtool and ORC, rightfully so. For now,
+ * pretend the stack always looks like the non-direct case.
+ */
+ UNWIND_HINT_RESTORE
+
/* Restore flags */
-2: popfq
+ popfq

/*
* As this jmp to ftrace_epilogue can be a short jump
--
2.20.1