Re: [PATCH v6 7/8] x86: patch all traced function calls using theint3-based framework

From: Petr Mladek
Date: Thu Jan 23 2014 - 09:21:51 EST

Next message: Jiri Olsa: "Re: [PATCH 5/5] perf tools: Show absolute percentage by default"
Previous message: Viresh Kumar: "Re: Is it ok for deferrable timer wakeup the idle cpu?"
In reply to: Petr Mladek: "Re: [PATCH v6 7/8] x86: patch all traced function calls using theint3-based framework"
Next in thread: Steven Rostedt: "Re: [PATCH v6 7/8] x86: patch all traced function calls using theint3-based framework"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Wed, 2014-01-22 at 14:20 +0100, Petr Mladek wrote:
> On Wed, 2014-01-15 at 10:47 -0500, Steven Rostedt wrote:
> > On Tue, 10 Dec 2013 16:42:19 +0100
> > Petr Mladek <pmladek@xxxxxxx> wrote:
> >
> > >
> > > diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> > > index 03abf9abf681..e5c02af3a8cc 100644
> > > --- a/arch/x86/kernel/alternative.c
> > > +++ b/arch/x86/kernel/alternative.c
> > > @@ -613,7 +613,7 @@ void __kprobes text_poke_or_die(void *addr, const void *opcode, size_t len)
> > > * responsible for setting the patched code read-write. This is more effective
> > > * only when you patch many addresses at the same time.
> > > */
> > > -static int text_poke_part(void *addr, const void *opcode, size_t len)
> > > +static int notrace text_poke_part(void *addr, const void *opcode, size_t len)
> >
> > I can understand why you want to not trace any of these, but again, I
> > hate using notrace unless it is absolutely necessary. That's what the
> > set_ftrace_notrace is for. If you don't want to trace it, simply tell
> > the kernel not to.
>
> I removed "notrace" from all locations, except for "poke_int3_handler".
> Then I tried to switch between 7 tracers, enable and disable tracing,
> and repeated this 100 times. It slowed down from:
>
> real 3m25.837s 3m29.280s 3m27.408s
> user 0m0.004s 0m0.004s 0m0.000s
> sys 0m3.532s 0m3.460s 0m3.416s
>
> to
>
> real 5m23.294s 5m24.944s 5m28.056s
> user 0m0.004s 0m0.004s 0m0.004s
> sys 0m3.076s 0m3.180s 0m3.396s
>
> It means a change by 60%. As you know indeed, the reason is that the
> functions used during patching are called via the int3 handler once the
> breakpoint is added.
>
> I thought about patching these functions separately, e.g. using separate
> list or extra filters with two round patching. But I think that it is
> not worth the extra complexity.
>
> I agree that it might be worth to keep the functions traceable. So, if
> the slowness is acceptable for you, I am fine with it as well.

I updated numbers in commit messages for the patchset v7. I am slightly
in doubts now. I wonder if the complex iterator-based text_poke_bp_list
is still worth the effort.

I tried to switch between 7 tracers: blk, branch, function_graph,
wakeup_rt, irqsoff, function, and nop. Every tracer has also been
enabled and disabled. I tested it on idle system with Intel(R) Core(TM)2
Duo CPU E8400 @ 3.00GHz. With 500 cycles I got these times:

1. original code (without the patch set):

real 18m4.545s 18m11.568s 18m14.396s
user 0m0.004s 0m0.008s 0m0.004s
sys 0m17.176s 0m16.940s 0m16.664s

2. with patchset v7 ; notrace used only for poke_int3_handler; the
slowness is caused by heavy using poke_int3_handler during patching

real 27m11.690s 27m18.176s 27m13.844s
user 0m0.008s 0m0.008s 0m0.008s
sys 0m15.916s 0m15.956s 0m15.860s

3. using text_poke_bp instead of text_poke_bp_list; every address
is patched separately; the slowness is caused by heavy using of
run_sync()

real 34m8.042s 34m15.584s 34m5.008s
user 0m0.004s 0m0.004s 0m0.004s
sys 0m16.296s 0m16.268s 0m16.048s

It means that the slowness caused by tracing "text_poke*" stuff is
comparable with the slowness caused by "run_sync()". It might mean that
"text_poke_bp_list" is not worth the added complexity.

Well, the iterator-based implementation is complex but still faster.
Also the many sync calls might be more painful for a busy system that
heavy use of the int3 handler. So, "text_poke_bp_list" still might be
useful.

Best Regards,
Petr

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Jiri Olsa: "Re: [PATCH 5/5] perf tools: Show absolute percentage by default"
Previous message: Viresh Kumar: "Re: Is it ok for deferrable timer wakeup the idle cpu?"
In reply to: Petr Mladek: "Re: [PATCH v6 7/8] x86: patch all traced function calls using theint3-based framework"
Next in thread: Steven Rostedt: "Re: [PATCH v6 7/8] x86: patch all traced function calls using theint3-based framework"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]