Re: [RFC][PATCH 1/2] x86: Allow breakpoints to emulate call functions

From: Steven Rostedt
Date: Mon May 06 2019 - 16:30:07 EST

On Mon, 6 May 2019 12:46:11 -0700
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Mon, May 6, 2019 at 11:57 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> > You should have waited another week to open that merge window ;-)
> Hmm. I'm looking at it while the test builds happen, and since I don't
> see what's wrong in the low-level entry code, I'm looking at the
> ftrace code instead.
> What's going on here?
>
> *pregs = int3_emulate_call(regs, (unsigned long)ftrace_regs_caller);
>
> that line makes no sense. Why would we emulate a call to
> ftrace_regs_caller()? That function sets up a pt_regs, and then calls
> ftrace_stub().

Because that call to ftrace_stub is also dynamic.

In entry_32.S

.globl ftrace_call
call ftrace_stub

	} else if (is_ftrace_caller(ip)) {
		if (!ftrace_update_func_call) {
			int3_emulate_jmp(regs, ip + CALL_INSN_SIZE);
			return 1;
		}
		*pregs = int3_emulate_call(regs, ftrace_update_func_call);
		return 1;
	}

Part of the code will change it to call the function needed directly.

struct ftrace_ops my_ops = {
	.func = my_handler,
};


Will change "call ftrace_stub" into "call my_handler"

If you register another ftrace_ops, it will change that to

call ftrace_ops_list_func

Which will iterate over all registered ftrace_ops, and depending on the
hashes in each ftrace_ops, will call the registered handler for them.
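
To make that switch-over concrete, here is a minimal user-space sketch
of how the target of the call site could be picked from the number of
registered callbacks. This is not the kernel code: struct toy_ops,
toy_register(), toy_pick_call_target(), toy_stub(), toy_ops_list_func()
and other_handler() are made-up names for illustration, and the real
logic handles more cases than these three.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*toy_func_t)(unsigned long ip);

/* Toy stand-in for struct ftrace_ops: just a handler and a list link. */
struct toy_ops {
	toy_func_t func;
	struct toy_ops *next;
};

static struct toy_ops *toy_ops_list;	/* all registered callbacks */

/* Stand-ins for ftrace_stub and the list iterator; only their
 * addresses matter in this sketch. */
static void toy_stub(unsigned long ip) { (void)ip; }
static void toy_ops_list_func(unsigned long ip) { (void)ip; }

/* Two example handlers (hypothetical). */
static void my_handler(unsigned long ip) { (void)ip; }
static void other_handler(unsigned long ip) { (void)ip; }

static void toy_register(struct toy_ops *ops)
{
	assert(ops && ops->func);
	ops->next = toy_ops_list;
	toy_ops_list = ops;
}

/* Decide what the patched "call" instruction should point at. */
static toy_func_t toy_pick_call_target(void)
{
	if (!toy_ops_list)
		return toy_stub;		/* nothing registered */
	if (!toy_ops_list->next)
		return toy_ops_list->func;	/* one ops: call it directly */
	return toy_ops_list_func;		/* several: use the iterator */
}
```

With nothing registered the site keeps calling the stub, registering
one toy_ops makes it call my_handler directly, and registering a second
one switches it to the iterator.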

> But we *have* pt_regs here already with the right values. Why isn't
> this just a direct call to ftrace_stub() from within the int3 handler?
> And the thing is, calling ftrace_regs_caller() looks wrong, because
> that's not what happens for *real* mcount handling, which uses that
> "create_trampoline()" to create the thing we're supposed to really
> use?

The ftrace_regs_caller() is what is called if there are two or more
callbacks registered to a single function. For example, you have a
function that is being live patched (it uses the ftrace_regs_caller copy
of the trampoline). But if you enable function tracing (which doesn't
need a copy of regs), it will call the ftrace_regs_caller, which will
call ftrace_ops_list_func(), which will look at each ftrace_ops (which
is the descriptor representing registered callbacks to ftrace), and
based on the hash value in them, will call their handler if the
ftrace_ops hashes match the function being called.

> Anyway, I simply don't understand the code, so I'm confused. But why
> is the int3 emulation creating a call that doesn't seem to match what
> the instruction that we're actually rewriting is supposed to do?
> IOW, it looks to me like ftrace_int3_handler() is actually emulating
> something different than what ftrace_modify_code() is actually
> modifying the code to do!
> Since the only caller of ftrace_modify_code() is update_ftrace_func(),
> why is that function not just saving the target and we'd emulate the
> call to that? Using anything else looks crazy?
> But as mentioned, I just don't understand the ftrace logic. It looks
> insane to me, and much more likely to be buggy than the very simple
> entry code.

Let's go through an example. Let's say we live patched do_IRQ() and
__migrate_task(). We would have this:

live_patch_trampoline:
(which is a modified copy of the ftrace_regs_caller)
pushl $__KERNEL_CS
pushl 4(%esp)
pushl $0
pushl %gs
pushl %fs
pushl %es
pushl %ds
pushl %eax
pushf /* Get flags and place them into the return ip slot */
popl %eax
movl %eax, 8*4(%esp)
pushl %ebp
pushl %edi
pushl %esi
pushl %edx
pushl %ecx
pushl %ebx
movl 12*4(%esp), %eax
subl $MCOUNT_INSN_SIZE, %eax
movl 15*4(%esp), %edx /* Load parent ip (2nd parameter) */
movl function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */
pushl %esp /* Save pt_regs as 4th parameter */

call live_kernel_patch_func

addl $4, %esp /* Skip pt_regs */
push 14*4(%esp)
popf /* Restore flags */
movl 12*4(%esp), %eax
movl %eax, 14*4(%esp)
popl %ebx
popl %ecx
popl %edx
popl %esi
popl %edi
popl %ebp
popl %eax
popl %ds
popl %es
popl %fs
popl %gs
lea 3*4(%esp), %esp /* Skip orig_ax, ip and cs */
jmp .Lftrace_ret

and at the start of both do_IRQ() and __migrate_task():

call live_patch_trampoline


Now we enable function tracing on all functions that can be traced, and
this includes do_IRQ() and __migrate_task(). Thus, we first modify that
call to ftrace_stub in the ftrace_regs_caller to point to
ftrace_ops_list_func(), as that will iterate over the ftrace_ops for
live kernel patching and the ftrace_ops for the function tracer. That
iterator will check the hashes against the called functions, and for
live kernel patching, it will call its handler if the passed in ip
matches either do_IRQ() or __migrate_task(). It will see that the
ftrace_ops for function tracing is set to trace all functions and just
call its handler in that loop too.
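
The loop described above can be sketched in plain user-space C as
follows. Everything here is an assumption for illustration only: the
toy_* names, the made-up addresses, and the linear array that stands in
for the real hash lookup. A NULL filter is used to mean "trace all
functions", and a hit counter takes the place of actually calling the
handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical addresses standing in for the traced functions. */
enum { IP_DO_IRQ = 0x1000, IP_MIGRATE_TASK = 0x2000, IP_OTHER = 0x3000 };

struct toy_ops {
	const unsigned long *filter;	/* NULL means "trace every function" */
	size_t nr_filter;
	unsigned int hits;		/* times this ops' handler would run */
	struct toy_ops *next;
};

/* Stand-in for the hash check: does this ops care about ip? */
static bool toy_ops_test(const struct toy_ops *ops, unsigned long ip)
{
	assert(ops);
	if (!ops->filter)
		return true;
	for (size_t i = 0; i < ops->nr_filter; i++)
		if (ops->filter[i] == ip)
			return true;
	return false;
}

/* Stand-in for the list iterator; returns how many handlers matched. */
static unsigned int toy_ops_list_func(struct toy_ops *list, unsigned long ip)
{
	unsigned int called = 0;

	for (struct toy_ops *ops = list; ops; ops = ops->next) {
		if (toy_ops_test(ops, ip)) {
			ops->hits++;	/* real code would call the handler */
			called++;
		}
	}
	return called;
}
```

With a live patch ops filtered on do_IRQ() and __migrate_task() plus a
tracer ops with no filter, both handlers run for those two functions
and only the tracer runs for everything else.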

Today, when we place an int3 on those functions, we basically turn them
into nops.

<int3>(convert from call live_patch_trampoline
to call ftrace_regs_caller)

But that int3 handler doesn't call either the live_patch_trampoline or
ftrace_regs_caller, which means the live kernel patching doesn't get
to make that function call something different. We basically just
disable tracing completely for those functions during that transition.

Remember that ftrace_regs_caller gets updated to not call ftrace_stub,
but the list iterator, if there's more than one handler registered
with ftrace (and so does ftrace_caller). By making the int3 handler
emulate a call to it, we do the iteration over all registered
ftrace_ops and nothing will be missed.
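
The difference between the two behaviours can be shown with a toy
model of the register state. This is only a sketch under my own
assumptions, not the int3_emulate_call() from the patch: struct
toy_regs, the array used as a stack, and the toy_* helpers are all
invented here, and the call site address is passed in explicitly to
keep the arithmetic obvious.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CALL_INSN_SIZE 5	/* x86 near call: opcode e8 + rel32 */
#define TOY_STACK_WORDS 16

struct toy_regs {
	unsigned long ip;		/* where execution resumes */
	size_t sp;			/* index into stack[]; grows down */
	unsigned long stack[TOY_STACK_WORDS];
};

/* Today's behaviour: skip the whole call, so the site acts like a nop. */
static void toy_emulate_skip(struct toy_regs *regs, unsigned long site)
{
	regs->ip = site + TOY_CALL_INSN_SIZE;
}

/* Proposed behaviour: act as if "call func" had executed at the site. */
static void toy_emulate_call(struct toy_regs *regs, unsigned long site,
			     unsigned long func)
{
	assert(regs->sp > 0);		/* room left for the return address */
	regs->stack[--regs->sp] = site + TOY_CALL_INSN_SIZE;
	regs->ip = func;
}
```

In the skip case nothing is pushed and no callback ever sees the
function. In the call case the return address lands on the stack and
execution resumes in the callback, which later returns to the
instruction after the call site.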

Does that help explain what's going on?

-- Steve