Re: [PATCH 1/2] kprobes: propagate error from arm_kprobe_ftrace()

From: Masami Hiramatsu
Date: Thu Oct 05 2017 - 02:24:02 EST


Hi Jessica,

On Wed, 4 Oct 2017 21:14:13 +0200
Jessica Yu <jeyu@xxxxxxxxxx> wrote:

> Improve error handling when arming ftrace-based kprobes. Specifically, if
> we fail to arm a ftrace-based kprobe, register_kprobe()/enable_kprobe()
> should report an error instead of success. Previously, this has led to
> confusing situations where register_kprobe() would return 0 indicating
> success, but the kprobe would not be functional if ftrace registration
> during the kprobe arming process had failed. We should therefore take any
> errors returned by ftrace into account and propagate this error so that we
> do not register/enable kprobes that cannot be armed. This can happen if,
> for example, register_ftrace_function() finds an IPMODIFY conflict (since
> kprobe_ftrace_ops has this flag set) and returns an error. Such a conflict
> is possible since livepatches also set the IPMODIFY flag for their ftrace_ops.
>
> arm_all_kprobes() keeps its current behavior and attempts to arm all
> kprobes. It returns the last encountered error and gives a warning if
> not all kprobes could be armed.
>
> This patch is based on Petr Mladek's original patchset (patches 2 and 3)
> back in 2015, which improved kprobes error handling, found here:
>
> https://lkml.org/lkml/2015/2/26/452
>
> However, further work on this had been paused since then and the patches
> were not upstreamed.

Ok, I have some comments. See below.

>
> Based-on-patches-by: Petr Mladek <pmladek@xxxxxxxx>
> Signed-off-by: Jessica Yu <jeyu@xxxxxxxxxx>
> ---
> kernel/kprobes.c | 87 +++++++++++++++++++++++++++++++++++++++-----------------
> 1 file changed, 61 insertions(+), 26 deletions(-)
>
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 2d28377a0e32..6e889be0d93c 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -979,18 +979,27 @@ static int prepare_kprobe(struct kprobe *p)
> }
>
> /* Caller must lock kprobe_mutex */
> -static void arm_kprobe_ftrace(struct kprobe *p)
> +static int arm_kprobe_ftrace(struct kprobe *p)
> {
> - int ret;
> + int ret = 0;
>
> ret = ftrace_set_filter_ip(&kprobe_ftrace_ops,
> (unsigned long)p->addr, 0, 0);
> - WARN(ret < 0, "Failed to arm kprobe-ftrace at %p (%d)\n", p->addr, ret);
> - kprobe_ftrace_enabled++;
> - if (kprobe_ftrace_enabled == 1) {
> + if (WARN(ret < 0, "Failed to arm kprobe-ftrace at %p (%d)\n", p->addr, ret))
> + return ret;
> +
> + if (kprobe_ftrace_enabled == 0) {
> ret = register_ftrace_function(&kprobe_ftrace_ops);
> - WARN(ret < 0, "Failed to init kprobe-ftrace (%d)\n", ret);
> + if (WARN(ret < 0, "Failed to init kprobe-ftrace (%d)\n", ret))
> + goto err_ftrace;
> }
> +
> + kprobe_ftrace_enabled++;
> + return ret;
> +
> +err_ftrace:
> + ftrace_set_filter_ip(&kprobe_ftrace_ops, (unsigned long)p->addr, 1, 0);
> + return ret;
> }
>
> /* Caller must lock kprobe_mutex */
> @@ -1009,22 +1018,23 @@ static void disarm_kprobe_ftrace(struct kprobe *p)
> }
> #else /* !CONFIG_KPROBES_ON_FTRACE */
> #define prepare_kprobe(p) arch_prepare_kprobe(p)
> -#define arm_kprobe_ftrace(p) do {} while (0)
> +#define arm_kprobe_ftrace(p) (0)
> #define disarm_kprobe_ftrace(p) do {} while (0)
> #endif
>
> /* Arm a kprobe with text_mutex */
> -static void arm_kprobe(struct kprobe *kp)
> +static int arm_kprobe(struct kprobe *kp)
> {
> - if (unlikely(kprobe_ftrace(kp))) {
> - arm_kprobe_ftrace(kp);
> - return;
> - }
> + if (unlikely(kprobe_ftrace(kp)))
> + return arm_kprobe_ftrace(kp);
> +
> cpus_read_lock();
> mutex_lock(&text_mutex);
> __arm_kprobe(kp);
> mutex_unlock(&text_mutex);
> cpus_read_unlock();
> +
> + return 0;
> }
>
> /* Disarm a kprobe with text_mutex */
> @@ -1363,9 +1373,14 @@ static int register_aggr_kprobe(struct kprobe *orig_p, struct kprobe *p)
>
> if (ret == 0 && kprobe_disabled(ap) && !kprobe_disabled(p)) {
> ap->flags &= ~KPROBE_FLAG_DISABLED;
> - if (!kprobes_all_disarmed)
> + if (!kprobes_all_disarmed) {
> /* Arm the breakpoint again. */
> - arm_kprobe(ap);
> + ret = arm_kprobe(ap);
> + if (ret) {
> + ap->flags |= KPROBE_FLAG_DISABLED;
> + list_del_rcu(&p->list);

Nice catch :) This list_del_rcu() is important to keep the error-case
behavior sane.

> + }
> + }
> }
> return ret;
> }
> @@ -1570,13 +1585,16 @@ int register_kprobe(struct kprobe *p)
> if (ret)
> goto out;
>
> + if (!kprobes_all_disarmed && !kprobe_disabled(p)) {
> + ret = arm_kprobe(p);
> + if (ret)
> + goto out;
> + }
> +

No, this is no good. There is a small chance of hitting the kprobe on
another CPU before it is added to the kprobe_table hash list. In that
case, we would see a stray breakpoint instruction.

> INIT_HLIST_NODE(&p->hlist);
> hlist_add_head_rcu(&p->hlist,
> &kprobe_table[hash_ptr(p->addr, KPROBE_HASH_BITS)]);
>
> - if (!kprobes_all_disarmed && !kprobe_disabled(p))
> - arm_kprobe(p);
> -

So, you'll have to roll back with hlist_del_rcu() here.
Hmm, by the way, in that case you also have to add a synchronize_rcu()
at the end of the error path, so that the user can release the kprobe
right after register_kprobe() returns an error... (I think that's OK
because this is not a hot path.)
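
In other words, something like this ordering (just a rough sketch of
what I mean, not tested):

	/* Add the kprobe to the hash list first, so that a hit on
	 * another CPU can always find it, and only then arm it. */
	INIT_HLIST_NODE(&p->hlist);
	hlist_add_head_rcu(&p->hlist,
		       &kprobe_table[hash_ptr(p->addr, KPROBE_HASH_BITS)]);

	if (!kprobes_all_disarmed && !kprobe_disabled(p)) {
		ret = arm_kprobe(p);
		if (ret) {
			/* Roll back: unlink the kprobe and wait for
			 * RCU readers, so the caller may free p
			 * immediately after the error return. */
			hlist_del_rcu(&p->hlist);
			synchronize_rcu();
			goto out;
		}
	}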

Thank you,


--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>