Re: Re: [PATCH v2] kprobes/x86: Use 5-byte NOP when the code might be modified by ftrace

From: Masami Hiramatsu
Date: Tue Feb 03 2015 - 06:52:31 EST


(2015/02/03 20:38), Petr Mladek wrote:
> On Tue 2015-02-03 16:41:39, Masami Hiramatsu wrote:
>> (2015/02/03 2:48), Petr Mladek wrote:
>>> can_probe() checks if the given address points to the beginning of
>>> an instruction. It analyzes all the instructions from the beginning
>>> of the function until the given address. The code might be modified
>>> by another Kprobe. In this case, the current code is read into a buffer,
>>> int3 breakpoint is replaced by the saved opcode in the buffer, and
>>> can_probe() analyzes the buffer instead.
>>>
>>> There is a bug that __recover_probed_insn() tries to restore
>>> the original code even for Kprobes using the ftrace framework.
>>> But in this case, the opcode is not stored. See the difference
>>> between arch_prepare_kprobe() and arch_prepare_kprobe_ftrace().
>>> The opcode is stored by arch_copy_kprobe() only from
>>> arch_prepare_kprobe().
>>>
>>> This patch makes Kprobes use the ideal 5-byte NOP when the code
>>> can be modified by ftrace. It is the original instruction; see
>>> ftrace_make_nop() and ftrace_nop_replace().
>>>
>>> Note that we always need to use the NOP for ftrace locations. Kprobes
>>> do not block ftrace and the instruction might get modified at any time.
>>> It might even be in an inconsistent state because it is modified step
>>> by step using the int3 breakpoint.
>>>
>>> The patch also fixes indentation of the touched comment.
>>>
>>> Note that I found this problem when playing with Kprobes. I did it
>>> on x86_64 with gcc-4.8.3 that supported -mfentry. I modified
>>> samples/kprobes/kprobe_example.c and added offset 5 to put
>>> the probe right after the fentry area:
>>>
>>> --- cut ---
>>> static struct kprobe kp = {
>>> .symbol_name = "do_fork",
>>> + .offset = 5,
>>> };
>>> --- cut ---
>>>
>>> Then I was able to load kprobe_example before jprobe_example
>>> but not the other way around:
>>>
>>> $> modprobe jprobe_example
>>> $> modprobe kprobe_example
>>> modprobe: ERROR: could not insert 'kprobe_example': Invalid or incomplete multibyte or wide character
>>>
>>> It did not make much sense and debugging pointed to the bug
>>> described above.
>>>
>>
>> This looks good to me :)
>>
>> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@xxxxxxxxxxx>
>>
>> Ingo, could you merge this as an urgent fix?
>
> Please, wait a bit, see below.
>
>> Thank you!
>>
>>> Signed-off-by: Petr Mladek <pmladek@xxxxxxx>
>>> ---
>>> arch/x86/kernel/kprobes/core.c | 42 ++++++++++++++++++++++++++++--------------
>>> 1 file changed, 28 insertions(+), 14 deletions(-)
>>>
>>> Changes against v1:
>>>
>>> + always use 5-byte NOP for ftrace location
>>> + fix indentation of the touched comment
>>>
>>> diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
>>> index 98f654d466e5..2f464b56766a 100644
>>> --- a/arch/x86/kernel/kprobes/core.c
>>> +++ b/arch/x86/kernel/kprobes/core.c
>>> @@ -223,27 +223,41 @@ static unsigned long
>>> __recover_probed_insn(kprobe_opcode_t *buf, unsigned long addr)
>>> {
>>> struct kprobe *kp;
>>> + unsigned long faddr;
>>>
>>> kp = get_kprobe((void *)addr);
>>> - /* There is no probe, return original address */
>>> - if (!kp)
>>> + faddr = ftrace_location(addr);
>
> I have just realized that ftrace_location() might return another
> address if the given one points inside the ftrace_location.
> This situation is not checked by this patch. I am going to work
> on v3.

Ah, I see. But if that happens, something has already gone wrong,
since __recover_probed_insn() must be called with an address at
which an instruction starts (e.g. the function entry or an
instruction boundary).
So, we just need "BUG_ON(faddr && faddr != addr)" for that
case.
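
To illustrate the idea, here is a rough userspace sketch of the
recovery logic with that check added. It is not the real patch:
mock_ftrace_location(), struct mock_kprobe, the text[] array and
nop5[] are hypothetical stand-ins for ftrace_location(), struct
kprobe, kernel text and ideal_nops[NOP_ATOMIC5], and the BUG_ON is
modeled by returning NULL so that the case can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_INSN_SIZE    16
#define MCOUNT_INSN_SIZE 5

/* Assumed stand-in for ideal_nops[NOP_ATOMIC5]: a common 5-byte NOP encoding. */
static const unsigned char nop5[MCOUNT_INSN_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Toy "kernel text": bytes 0..4 are the ftrace (fentry) site. */
static unsigned char text[32];

/* Hypothetical reduced kprobe: only the saved first opcode byte. */
struct mock_kprobe {
	unsigned char opcode;
};

/*
 * Mock of ftrace_location(): returns the start of the ftrace site
 * when addr points anywhere inside it, NULL otherwise.
 */
static const unsigned char *mock_ftrace_location(const unsigned char *addr)
{
	if (addr >= text && addr < text + MCOUNT_INSN_SIZE)
		return text;
	return NULL;
}

/*
 * Sketch of __recover_probed_insn() with the extra check.
 * kp is NULL when no kprobe is registered at addr.
 */
static const unsigned char *
recover_probed_insn(unsigned char *buf, const unsigned char *addr,
		    const struct mock_kprobe *kp)
{
	const unsigned char *faddr = mock_ftrace_location(addr);

	assert(buf != NULL);

	/* In the kernel: BUG_ON(faddr && faddr != addr) */
	if (faddr && faddr != addr)
		return NULL;

	/* Not modified by a kprobe and cannot be modified by ftrace. */
	if (!kp && !faddr)
		return addr;

	memcpy(buf, addr, MAX_INSN_SIZE);
	if (faddr)
		memcpy(buf, nop5, MCOUNT_INSN_SIZE);	/* original code is the NOP */
	else
		buf[0] = kp->opcode;			/* undo the int3 */
	return buf;
}
```

In this sketch, calling recover_probed_insn() with text + 2 (an
address inside the ftrace site) returns NULL, which is the case
where the kernel version would hit the BUG_ON.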

Thank you,

>
> I knew that I should not have sent the patch just before leaving :-(
>
> Best Regards,
> Petr
>
>>> + /*
>>> + * Use the current code if it is not modified by Kprobe
>>> + * and it cannot be modified by ftrace.
>>> + */
>>> + if (!kp && !faddr)
>>> return addr;
>>>
>>> /*
>>> - * Basically, kp->ainsn.insn has an original instruction.
>>> - * However, RIP-relative instruction can not do single-stepping
>>> - * at different place, __copy_instruction() tweaks the displacement of
>>> - * that instruction. In that case, we can't recover the instruction
>>> - * from the kp->ainsn.insn.
>>> + * Basically, kp->ainsn.insn has an original instruction.
>>> + * However, RIP-relative instruction can not do single-stepping
>>> + * at different place, __copy_instruction() tweaks the displacement of
>>> + * that instruction. In that case, we can't recover the instruction
>>> + * from the kp->ainsn.insn.
>>> *
>>> - * On the other hand, kp->opcode has a copy of the first byte of
>>> - * the probed instruction, which is overwritten by int3. And
>>> - * the instruction at kp->addr is not modified by kprobes except
>>> - * for the first byte, we can recover the original instruction
>>> - * from it and kp->opcode.
>>> + * On the other hand, in case of a normal Kprobe, kp->opcode has a copy
>>> + * of the first byte of the probed instruction, which is overwritten
>>> + * by int3. And the instruction at kp->addr is not modified by kprobes
>>> + * except for the first byte, we can recover the original instruction
>>> + * from it and kp->opcode.
>>> + *
>>> + * In case of Kprobes using ftrace, we do not have a copy of
>>> + * the original instruction. In fact, the ftrace location might
>>> + * be modified at anytime and even could be in an inconsistent state.
>>> + * Fortunately, we know that the original code is the ideal 5-byte
>>> + * long NOP.
>>> */
>>> - memcpy(buf, kp->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
>>> - buf[0] = kp->opcode;
>>> + memcpy(buf, (void *)addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
>>> + if (faddr)
>>> + memcpy(buf, ideal_nops[NOP_ATOMIC5], MCOUNT_INSN_SIZE);
>>> + else
>>> + buf[0] = kp->opcode;
>>> return (unsigned long)buf;
>>> }
>>>
>>>
>>
>>
>> --
>> Masami HIRAMATSU
>> Software Platform Research Dept. Linux Technology Research Center
>> Hitachi, Ltd., Yokohama Research Laboratory
>> E-mail: masami.hiramatsu.pt@xxxxxxxxxxx
>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>


--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@xxxxxxxxxxx

