Re: [PATCH v4 2/2] arm64: kprobe: Enable OPTPROBE for arm64

From: Masami Hiramatsu
Date: Mon Nov 29 2021 - 09:37:27 EST


Hi,

On Mon, 29 Nov 2021 14:50:22 +0800
"liuqi (BA)" <liuqi115@xxxxxxxxxx> wrote:

>
>
> On 2021/11/29 13:00, Masami Hiramatsu wrote:
> > On Mon, 29 Nov 2021 09:40:30 +0800
> > "liuqi (BA)" <liuqi115@xxxxxxxxxx> wrote:
> >
> >>
> >>
> >> On 2021/11/27 20:23, Masami Hiramatsu wrote:
> >>> On Fri, 26 Nov 2021 18:31:06 +0800
> >>> "liuqi (BA)" <liuqi115@xxxxxxxxxx> wrote:
> >>>
> >>>>
> >>>>
> >>>> On 2021/8/24 18:50, Mark Rutland wrote:
> >>>>>> diff --git a/arch/arm64/kernel/probes/optprobe_trampoline.S b/arch/arm64/kernel/probes/optprobe_trampoline.S
> >>>>>> new file mode 100644
> >>>>>> index 000000000000..24d713d400cd
> >>>>>> --- /dev/null
> >>>>>> +++ b/arch/arm64/kernel/probes/optprobe_trampoline.S
> >>>>>> @@ -0,0 +1,37 @@
> >>>>>> +/* SPDX-License-Identifier: GPL-2.0 */
> >>>>>> +/*
> >>>>>> + * trampoline entry and return code for optprobes.
> >>>>>> + */
> >>>>>> +
> >>>>>> +#include <linux/linkage.h>
> >>>>>> +#include <asm/asm-offsets.h>
> >>>>>> +#include <asm/assembler.h>
> >>>>>> +
> >>>>>> + .global optprobe_template_entry
> >>>>>> +optprobe_template_entry:
> >>>>> Please use SYM_*(); see arch/arm64/kernel/entry-ftrace.S for examples of
> >>>>> how to use that for trampolines.
> >>>>>
> >>>>> This should be:
> >>>>>
> >>>>> SYM_CODE_START(optprobe_template)
> >>>>>
> >>>> Hi all,
> >>>>
> >>>> I met a problem when using SYM_CODE_START(optprobe_template) to
> >>>> replace optprobe_template_entry.
> >>>>
> >>>> If SYM_CODE_START is used, all optprobes will share one trampoline
> >>>> space. Under these circumstances, if the user registers two optprobes,
> >>>> the trampoline will be overwritten by the newer one, and this will
> >>>> cause a kernel panic when the old optprobe is triggered.
> >>>
> >>> Hm, this is curious, because the template should be copied to the
> >>> trampoline buffer for each optprobe and then modified.
> >>>
> >>>>
> >>>> Using optprobe_template_entry will not have this problem, as each
> >>>> optprobe has its own trampoline space (allocated in get_opinsn_slot()).
> >>>
> >>> Yes, it is designed to do so.
> >>>
> >>> Thank you,
> >>>
> >>
> >> Hi Masami,
> >>
> >> Thanks for your reply. But I also met a problem when using
> >> get_opinsn_slot() to allocate the trampoline buffer.
> >>
> >> As module_alloc() (like on x86) is used to allocate the buffer, the
> >> trampoline is in module space, so if the original instruction is in
> >> kernel space, the distance between the original instruction and the
> >> trampoline can exceed 128M.
> >>
> >> As module PLTs cannot be used here, I have no idea how to achieve a
> >> long jump in this situation. Do you have any good ideas?
> >
> Hi Masami,
>
> Thanks so much for your reply.
>
> > One possible solution is to use a pre-allocated trampoline space in
> > the text area, the same as ppc64 does.
> > (See arch/powerpc/kernel/optprobes_head.S, which embeds a space at "optinsn_slot")
> >
>
> I found something interesting in arch/powerpc/kernel/optprobes.c: it uses
> "optinsn_slot" as a public buffer, and uses a static "insn_page_in_use"
> flag to make sure there is only one optprobe in the kernel.
>
> If we use this solution, users could only register one optprobe at a
> time. This would also be a limitation for users; what's your opinion
> on this?

No, that is just a memory area for pooling trampoline buffers, so each
optprobe can allocate its buffer from that area. Please see
kernel/kprobes.c:344. An optprobe allocates an "insn_slot" from
kprobe_optinsn_slots, which uses alloc_optinsn_page() to allocate the
pool of slots.
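
For reference, ppc64 reserves that area in arch/powerpc/kernel/optprobes_head.S
roughly like this (the size is just what ppc64 happens to use; an arm64
version could do the same):

	.balign	4
	.global optinsn_slot
optinsn_slot:
	/* Reserve an area to allocate slots for detour buffer */
	.space	65536

The "insn_page_in_use" flag you found only guards this one page;
kprobe_optinsn_slots then carves the page into per-probe slots, so many
optprobes can share it.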

Thank you,

>
>
> > Also, the trampoline can be minimized, since all we need is the
> > probed address (and the address of the struct optprobe).
> > A single trampoline entry will do the following:
> >
> > 1. push lr and a victim register (here, x0)
> > 2. load the address of the optprobe into x0
> > 3. call (br) the common-optprobe asm code
> > 4. pop lr and x0
> > 5. jump back to (next to) the original place
> >
> > Here the common-optprobe asm code does the following:
> >
> > c1. push all registers on the stack (like save_all_base_regs) to make
> > a struct pt_regs.
> > c2. set the pt_regs address to x1.
> > c3. call optimized_callback()
> > c4. return
> >
> > Since arm64 will emulate the probed instruction, we can do this.
> > (On the other hand, x86 needs to run the probed insn in the trampoline
> > code; it does that between steps 4 and 5.)
> >
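
BTW, here is a rough, untested sketch of what I mean above. The
"common_optprobe_entry" label is only illustrative; save_all_base_regs,
restore_all_base_regs, S_X0, S_LR and PT_REGS_SIZE are the ones your
patch already uses, and the lr/sp fixups in pt_regs are glossed over:

	/* Per-probe trampoline, copied and patched for each optprobe */
	stp	x0, lr, [sp, #-16]!	/* 1. push lr and a victim register */
	ldr	x0, 1f			/* 2. load &optprobe into x0 */
	bl	common_optprobe_entry	/* 3. call common-optprobe asm code
					 *    (assumes the slot is within branch
					 *    range, e.g. in the pre-allocated
					 *    text area) */
	ldp	x0, lr, [sp], #16	/* 4. pop lr and x0 */
	b	.			/* 5. patched to jump back to (next to)
					 *    the probed instruction */
1:	.quad	0			/* filled with &optprobe at prepare time */

	/* Common code, shared by all optprobe trampolines */
common_optprobe_entry:
	sub	sp, sp, #PT_REGS_SIZE
	save_all_base_regs		/* c1. make struct pt_regs on the stack
					 * (this clobbers x0/x1 and stashes our
					 * return address in pt_regs->regs[30]) */
	ldr	x0, [sp, #S_X0]		/* reload &optprobe saved by the macro */
	ldr	x2, [sp, #PT_REGS_SIZE]	/* original x0, pushed by the stub... */
	str	x2, [sp, #S_X0]		/* ...goes into pt_regs->regs[0]
					 * (the original lr and sp would need
					 * the same kind of fixup; omitted) */
	mov	x1, sp			/* c2. set the pt_regs address to x1 */
	bl	optimized_callback	/* c3. call optimized_callback() */
	restore_all_base_regs
	ldr	lr, [sp, #S_LR]		/* our return address, stashed above */
	add	sp, sp, #PT_REGS_SIZE
	ret				/* c4. return to the per-probe trampoline */
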
>
> I'll try to minimize the trampoline according to this.
>
> Thanks,
> Qi
> > The trampoline entry code is just 5 instructions (though an immediate
> > value (&optprobe) may need to be embedded).
> >
> > Thank you,
> >
> >>
> >> Thanks,
> >> Qi
> >>
> >>>>
> >>>> So how can SYM_CODE_START be reused in this situation? Does anyone
> >>>> have a good idea?
> >>>>
> >>>> Thanks,
> >>>> Qi
> >>>>> ... and note the matching end below.
> >>>>>
> >>>>>> + sub sp, sp, #PT_REGS_SIZE
> >>>>>> + save_all_base_regs
> >>>>>> + /* Get parameters to optimized_callback() */
> >>>>>> + ldr x0, 1f
> >>>>>> + mov x1, sp
> >>>>>> + /* Branch to optimized_callback() */
> >>>>>> + .global optprobe_template_call
> >>>>>> +optprobe_template_call:
> >>>>> SYM_INNER_LABEL(optprobe_template_call, SYM_L_GLOBAL)
> >>>>>
> >>>>> ...and likewise for all the other labels.
> >>>>>
> >>>>>> + nop
> >>>>>> + restore_all_base_regs
> >>>>>> + ldr lr, [sp, #S_LR]
> >>>>>> + add sp, sp, #PT_REGS_SIZE
> >>>>>> + .global optprobe_template_restore_orig_insn
> >>>>>> +optprobe_template_restore_orig_insn:
> >>>>>> + nop
> >>>>>> + .global optprobe_template_restore_end
> >>>>>> +optprobe_template_restore_end:
> >>>>>> + nop
> >>>>>> + .global optprobe_template_end
> >>>>>> +optprobe_template_end:
> >>>>>> + .global optprobe_template_val
> >>>>>> +optprobe_template_val:
> >>>>>> + 1: .long 0
> >>>>>> + .long 0
> >>>>>> + .global optprobe_template_max_length
> >>>>>> +optprobe_template_max_length:
> >>>>> SYM_INNER_LABEL(optprobe_template_end, SYM_L_GLOBAL)
> >>>>> SYM_CODE_END(optprobe_template)
> >>>>>
> >>>>> Thanks,
> >>>>> Mark.
> >>>>>
> >>>>>> --
> >>>
> >>>
> >
> >


--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>