Re: [RFC] kretprobe: Prevent triggering kretprobe from within kprobe_flush_task

From: Masami Hiramatsu
Date: Thu Apr 09 2020 - 20:32:06 EST


Hi Jiri,

On Thu, 9 Apr 2020 22:13:36 +0200
Jiri Olsa <jolsa@xxxxxxxxxx> wrote:

> On Thu, Apr 09, 2020 at 08:45:01PM +0200, Jiri Olsa wrote:
> > On Thu, Apr 09, 2020 at 11:41:01PM +0900, Masami Hiramatsu wrote:
> >
> > SNIP
> >
> > > > ---
> > > > kernel/kprobes.c | 6 ++++++
> > > > 1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > > > index 2625c241ac00..b13247cae752 100644
> > > > --- a/kernel/kprobes.c
> > > > +++ b/kernel/kprobes.c
> > > > @@ -1236,6 +1236,10 @@ __releases(hlist_lock)
> > > > }
> > > > NOKPROBE_SYMBOL(kretprobe_table_unlock);
> > > >
> > > > +static struct kprobe kretprobe_dummy = {
> > > > + .addr = (void *)kretprobe_trampoline,
> > > > +};
> > > > +
> > > > /*
> > > > * This function is called from finish_task_switch when task tk becomes dead,
> > > > * so that we can recycle any function-return probe instances associated
> > > > @@ -1256,12 +1260,14 @@ void kprobe_flush_task(struct task_struct *tk)
> > > > INIT_HLIST_HEAD(&empty_rp);
> > > > hash = hash_ptr(tk, KPROBE_HASH_BITS);
> > > > head = &kretprobe_inst_table[hash];
> > > > + __this_cpu_write(current_kprobe, &kretprobe_dummy);
> > >
> > > Can you also set the kcb->kprobe_state = KPROBE_HIT_ACTIVE?
> > >
> > > BTW, we may be better to introduce a common kprobe_reject_section_start()
> > > and kprobe_reject_section_end() so that the user don't need to prepare
> > > dummy kprobes.
> >
> > sure, will do
> >
> > thank you both for review
>
> ok, found out it's actually arch code.. would you guys be ok with something like below?

Thanks for update!

>
> jirka
>
>
> ---
> diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
> index 4d7022a740ab..081d0f366c99 100644
> --- a/arch/x86/kernel/kprobes/core.c
> +++ b/arch/x86/kernel/kprobes/core.c
> @@ -757,12 +757,33 @@ static struct kprobe kretprobe_kprobe = {
> .addr = (void *)kretprobe_trampoline,
> };
>
> +void arch_kprobe_reject_section_start(void)
> +{
> + struct kprobe_ctlblk *kcb;
> +
> + preempt_disable();
> +
> + /*
> + * Set a dummy kprobe for avoiding kretprobe recursion.
> + * Since kretprobe never run in kprobe handler, kprobe must not
> + * be running behind this point.
> + */
> + __this_cpu_write(current_kprobe, &kretprobe_kprobe);
> + kcb = get_kprobe_ctlblk();
> + kcb->kprobe_status = KPROBE_HIT_ACTIVE;
> +}

Yeah, the code seems good to me.

BTW, I rather like make it arch independent function so that other
arch can use it. In this case, the dummy kprobe's addr should be
somewhere obviously blacklisted (but it must be accessible.)
I think get_kprobe() will be a candidate.

And (sorry about changing my mind), the naming, I think kprobe_busy_begin()
and kprobe_busy_end() will be better because it doesn't reject registering
kprobes, instead, just make it busy :)

Thank you,

> +
> +void arch_kprobe_reject_section_end(void)
> +{
> + __this_cpu_write(current_kprobe, NULL);
> + preempt_enable();
> +}
> +
> /*
> * Called from kretprobe_trampoline
> */
> __used __visible void *trampoline_handler(struct pt_regs *regs)
> {
> - struct kprobe_ctlblk *kcb;
> struct kretprobe_instance *ri = NULL;
> struct hlist_head *head, empty_rp;
> struct hlist_node *tmp;
> @@ -772,16 +793,7 @@ __used __visible void *trampoline_handler(struct pt_regs *regs)
> void *frame_pointer;
> bool skipped = false;
>
> - preempt_disable();
> -
> - /*
> - * Set a dummy kprobe for avoiding kretprobe recursion.
> - * Since kretprobe never run in kprobe handler, kprobe must not
> - * be running at this point.
> - */
> - kcb = get_kprobe_ctlblk();
> - __this_cpu_write(current_kprobe, &kretprobe_kprobe);
> - kcb->kprobe_status = KPROBE_HIT_ACTIVE;
> + arch_kprobe_reject_section_start();
>
> INIT_HLIST_HEAD(&empty_rp);
> kretprobe_hash_lock(current, &head, &flags);
> @@ -873,8 +885,7 @@ __used __visible void *trampoline_handler(struct pt_regs *regs)
>
> kretprobe_hash_unlock(current, &flags);
>
> - __this_cpu_write(current_kprobe, NULL);
> - preempt_enable();
> + arch_kprobe_reject_section_end();
>
> hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
> hlist_del(&ri->hlist);
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 2625c241ac00..935dd8c87705 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1236,6 +1236,14 @@ __releases(hlist_lock)
> }
> NOKPROBE_SYMBOL(kretprobe_table_unlock);
>
> +void __weak arch_kprobe_reject_section_start(void)
> +{
> +}
> +
> +void __weak arch_kprobe_reject_section_end(void)
> +{
> +}
> +
> /*
> * This function is called from finish_task_switch when task tk becomes dead,
> * so that we can recycle any function-return probe instances associated
> @@ -1253,6 +1261,8 @@ void kprobe_flush_task(struct task_struct *tk)
> /* Early boot. kretprobe_table_locks not yet initialized. */
> return;
>
> + arch_kprobe_reject_section_start();
> +
> INIT_HLIST_HEAD(&empty_rp);
> hash = hash_ptr(tk, KPROBE_HASH_BITS);
> head = &kretprobe_inst_table[hash];
> @@ -1266,6 +1276,8 @@ void kprobe_flush_task(struct task_struct *tk)
> hlist_del(&ri->hlist);
> kfree(ri);
> }
> +
> + arch_kprobe_reject_section_end();
> }
> NOKPROBE_SYMBOL(kprobe_flush_task);
>
>


--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>