Re: [RFC PATCH 1/1] x86/ftrace: make ftrace_int3_handler() not to skip fops invocation
From: Nicolai Stange
Date: Tue Apr 23 2019 - 14:15:56 EST
Hi Steven,
many thanks for having a look!
Steven Rostedt <rostedt@xxxxxxxxxxx> writes:
> I just found this in my Inbox, and this looks to be a legit issue.
>
> On Thu, 26 Jul 2018 12:40:29 +0200
> Nicolai Stange <nstange@xxxxxxx> wrote:
>
> You still working on this?
Yes, this still needs to get fixed somehow, preferably at the ftrace
layer.
>
>> With dynamic ftrace, ftrace patches call sites in a three steps:
>> 1. Put a breakpoint at the to be patched location,
>> 2. update call site and
>> 3. finally remove the breakpoint again.
>>
>> Note that the breakpoint ftrace_int3_handler() simply makes execution
>> to skip over the to be patched function.
>>
>> This patching happens in the following circumstances:
>> - the global ftrace_trace_function changes and the call sites at
>> ftrace_call and ftrace_regs_call get patched,
>> - an ftrace_ops' ->func changes and the call site in its trampoline gets
>> patched,
>> - graph tracing gets enabled/disabled and the jump site at
>> ftrace_graph_call gets patched
>> - a mcount site gets converted from nop -> call, call -> nop
>> or call -> call.
>>
>> The latter case, i.e. a mcount call getting redirected, happens for example
>> in a transition from trampolined to not trampolined upon a user enabling
>> function tracing on a live patched function.
>>
>> The ftrace_int3_handler() simply skipping over the mcount callsite is
>> problematic here, because it means that a live patched function gets
>> temporarily reverted to its unpatched original and this breaks live
>> patching consistency.
>>
>> Make ftrace_int3_handler not to skip over the fops invocation, but modify
>> the interrupted control flow to issue a call as needed.
>>
>> For determining what the proper action actually is, modify
>> update_ftrace_func() to take an extra argument, func, to be called if the
>> corresponding breakpoint gets hit. Introduce a new global variable,
>> trace_update_func_dest and let update_ftrace_func() set it. For the special
>> case of patching the jmp insn at ftrace_graph_call, set it to zero and make
>> ftrace_int3_handler() just skip over the insn in this case as before.
>>
>> Because there's no space left above the int3 handler's iret frame to stash
>> an extra call's return address in, this can't be mimicked from the
>> ftrace_int3_handler() itself.
>>
>> Instead, make ftrace_int3_handler() change the iret frame's %rip slot to
>> point to the new ftrace_int3_call_trampoline to be executed immediately
>> after iret.
>>
>> The original %rip gets communicated to ftrace_int3_call_trampoline which
>> can then take proper action. Note that ftrace_int3_call_trampoline can
>> nest, because of NMIs, for example. In order to avoid introducing another
>> stack, abuse %r11 for passing the original %rip. This is safe, because the
>> interrupted context is always at a call insn and according to the x86_64
>> ELF ABI allows %r11 is callee-clobbered. According to the glibc sources,
>> this is also true for the somewhat special mcount/fentry ABI.
>>
>> OTOH, a spare register doesn't exist on i686 and thus, this is approach is
>> limited to x86_64.
>>
>> For determining the call target address, let ftrace_int3_call_trampoline
>> compare ftrace_update_func against the interrupted %rip.
>
> I don't think we need to do the compare.
>
>> If they match, an update_ftrace_func() is currently pending and the
>> call site is either of ftrace_call, ftrace_regs_call or the call insn
>> within a fops' trampoline. Jump to the ftrace_update_func_dest location as
>> set by update_ftrace_func().
>>
>> If OTOH the interrupted %rip doesn't equal ftrace_update_func, then
>> it points to an mcount call site. In this case, redirect control flow to
>> the most generic handler, ftrace_regs_caller, which will then do the right
>> thing.
>>
>> Finally, reading ftrace_update_func and ftrace_update_func_dest from
>> outside of the int3 handler requires some care: inbetween adding and
>> removing the breakpoints, ftrace invokes run_sync() which basically
>> issues a couple of IPIs. Because the int3 handler has interrupts disabled,
>> it orders with run_sync().
>>
>> To extend that ordering to also include ftrace_int3_call_trampoline, make
>> ftrace_int3_handler() disable interrupts within the iret frame returning to
>> it.
>>
>> Store the original EFLAGS.IF into %r11's MSB which, because it represents
>> a kernel address, is redundant. Make ftrace_int3_call_trampoline restore
>> it when done.
>
> This can be made much simpler by making a hardcoded ftrace_int3_tramp
> that does the following:
>
> ftrace_int3_tramp
> push %r11
> jmp ftrace_caller
But wouldn't this mean that ftrace_caller could nest if the breakpoint
in question happened to be placed at ftrace_call? Infinite recursion set
aside, the ip value determined from inner calls based on the on-stack
return address would be wrong, AFAICS. Or am I missing something here?
> The ftrace_caller will either call a single ftrace callback, if there's
> only a single ops registered, or it will call the loop function that
> iterates over all the ftrace_ops registered and will call each function
> that matches the ftrace_ops hashes.
>
> In either case, it's what we want.
Ok, under the assumption that my above point is valid, this patch could
still get simplified a lot by having two trampolines:
1.) Your ftrace_int3_tramp from above, to be used if the patched insn is
some mcount call site. The live patching fops will need
ftrace_regs_caller though. So,
ftrace_int3_tramp_regs_caller:
push %r11
jmp ftrace_regs_caller
2.) Another one redirecting control flow to ftrace_ops_list_func(). It's
going to be used when the int3 is found at ftrace_call or
ftrace_regs_call resp. their counterparts in some ftrace_ops'
trampoline.
ftrace_int3_tramp_list_func:
push %r11
jmp ftrace_ops_list_func
ftrace_int3_handler() would then distinguish the following cases:
1.) ip == ftrace_graph_call => ignore, i.e. skip the insn
2.) is_ftrace_caller(ip) => ftrace_int3_tramp_list_func
3.) ftrace_location(ip) => ftrace_int3_tramp_regs_caller
ftrace_location() is getting invoked from ftrace_int3_handler() already,
so there wouldn't be any additional cost.
If that makes sense to you, I'd prepare a patch...
> The ftrace_int3_tramp will simply simulate the call ftrace_caller that
> would be there as the default caller if more than one function is set
> to it.
>
> For 32 bit, we could add 4 variables on the thread_info and make 4
> trampolines, one for each context (normal, softirq, irq and NMI), and
> have them use the variable stored in the thread_info for that location:
>
> ftrace_int3_tramp_normal
> push %eax # just for space
> push %eax
> mov whatever_to_get_thread_info, %eax
> mov normal(%eax), %eax
> mov %eax, 4(%esp)
> pop %eax
> jmp ftrace_caller
>
> ftrace_int3_tramp_softiqr
> push %eax # just for space
> push %eax
> mov whatever_to_get_thread_info, %eax
> mov softirq(%eax), %eax
> mov %eax, 4(%esp)
> pop %eax
> jmp ftrace_caller
>
> etc..
>
>
> A bit crazier for 32 bit but it can still be done. ;-)
Ok, but currently CONFIG_HAVE_LIVEPATCH=n for x86 && !x86_64,
so I'd rather not invest too much energy into screwing 32 bit up ;)
Thanks!
Nicolai
--
SUSE Linux GmbH, GF: Felix ImendÃrffer, Jane Smithard, Graham Norton,
HRB 21284 (AG NÃrnberg)