RE: [PATCH v2] perf probe: fix kretprobe issue caused by GCC bug
From: Jianlin Lv
Date: Fri Feb 12 2021 - 23:22:08 EST
> -----Original Message-----
> From: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> Sent: Saturday, February 13, 2021 5:34 AM
> To: Jianlin Lv <Jianlin.Lv@xxxxxxx>
> Cc: peterz@xxxxxxxxxxxxx; mingo@xxxxxxxxxx; Mark Rutland
> <Mark.Rutland@xxxxxxx>; alexander.shishkin@xxxxxxxxxxxxxxx;
> jolsa@xxxxxxxxxx; namhyung@xxxxxxxxxx; nathan@xxxxxxxxxx;
> ndesaulniers@xxxxxxxxxx; mhiramat@xxxxxxxxxx; fche@xxxxxxxxxx;
> irogers@xxxxxxxxxx; sumanthk@xxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; clang-built-linux@xxxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2] perf probe: fix kretprobe issue caused by GCC bug
>
> Em Wed, Feb 10, 2021 at 02:26:46PM +0800, Jianlin Lv escreveu:
> > perf fails to add a kretprobe event when it uses the debuginfo of a
> > vmlinux compiled by GCC with the -fpatchable-function-entry option
> > enabled. The same issue occurs with kernel modules.
> >
> > Issue:
> >
> > # perf probe -v 'kernel_clone%return $retval'
> > ......
> > Writing event: r:probe/kernel_clone__return _text+599624 $retval
> > Failed to write event: Invalid argument
> > Error: Failed to add events. Reason: Invalid argument (Code: -22)
> >
> > # cat /sys/kernel/debug/tracing/error_log
> > [156.75] trace_kprobe: error: Retprobe address must be an function entry
> > Command: r:probe/kernel_clone__return _text+599624 $retval
> > ^
> >
> > # llvm-dwarfdump vmlinux |grep -A 10 -w 0x00df2c2b
> > 0x00df2c2b: DW_TAG_subprogram
> > DW_AT_external (true)
> > DW_AT_name ("kernel_clone")
> > DW_AT_decl_file ("/home/code/linux-next/kernel/fork.c")
> > DW_AT_decl_line (2423)
> > DW_AT_decl_column (0x07)
> > DW_AT_prototyped (true)
> > DW_AT_type (0x00dcd492 "pid_t")
> > DW_AT_low_pc (0xffff800010092648)
> > DW_AT_high_pc (0xffff800010092b9c)
> > DW_AT_frame_base (DW_OP_call_frame_cfa)
> >
> > # cat /proc/kallsyms |grep kernel_clone
> > ffff800010092640 T kernel_clone
> > # readelf -s vmlinux |grep -i kernel_clone
> > 183173: ffff800010092640 1372 FUNC GLOBAL DEFAULT 2 kernel_clone
> >
> > # objdump -d vmlinux |grep -A 10 -w \<kernel_clone\>:
> > ffff800010092640 <kernel_clone>:
> > ffff800010092640: d503201f nop
> > ffff800010092644: d503201f nop
> > ffff800010092648: d503233f paciasp
> > ffff80001009264c: a9b87bfd stp x29, x30, [sp, #-128]!
> > ffff800010092650: 910003fd mov x29, sp
> > ffff800010092654: a90153f3 stp x19, x20, [sp, #16]
> >
> > The entry address of kernel_clone converted by debuginfo is
> > _text+599624 (0x92648), which is consistent with the value of the
> > DW_AT_low_pc attribute.
> > But the symbolic address of kernel_clone from /proc/kallsyms is
> > ffff800010092640.
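
The address arithmetic in the paragraph above can be checked with a few lines of C. This is only a sketch: the helper names and A64_INSN_BYTES are made up for illustration, and the numbers are the ones quoted in this report (DW_AT_low_pc, the kallsyms address, and the offset from "_text+599624").

```c
#include <assert.h>
#include <stdint.h>

/* A64 instructions are fixed-width; the name is hypothetical. */
#define A64_INSN_BYTES 4ULL

/* Base of _text implied by a "_text+offset" probe string, assuming the
 * offset was derived from low_pc, as the report describes. */
static uint64_t text_base(uint64_t low_pc, uint64_t text_offset)
{
	assert(low_pc >= text_offset);
	return low_pc - text_offset;
}

/* Distance in bytes between the symbol start and DW_AT_low_pc. */
static uint64_t entry_skew(uint64_t low_pc, uint64_t sym_addr)
{
	assert(low_pc >= sym_addr);
	return low_pc - sym_addr;
}

/* Number of whole instructions covered by that distance. */
static uint64_t skipped_insns(uint64_t skew)
{
	return skew / A64_INSN_BYTES;
}
```

With low_pc = 0xffff800010092648, sym_addr = 0xffff800010092640 and text_offset = 599624, the skew is 8 bytes, which is the two NOPs shown in the objdump output.
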
> >
> > This issue was found on arm64, where -fpatchable-function-entry=2 is
> > enabled when CONFIG_DYNAMIC_FTRACE_WITH_REGS=y.
> > As the objdump output for kernel_clone shows, GCC generates 2 NOPs
> > at the beginning of each function.
> >
> > kprobe_on_func_entry detects that (_text+599624) is not the entry
> > address of the function, which leads to the failure of adding the
> > kretprobe event.
> >
> > ---
> > kprobe_on_func_entry
> > ->_kprobe_addr
> > ->kallsyms_lookup_size_offset
> > ->arch_kprobe_on_func_entry// FALSE
> > ---
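
The call chain above boils down to "look up the symbol containing the address, then require offset 0". The following is a simplified user-space model of that idea, not the kernel's code: struct sym, sym_offset() and on_func_entry() are hypothetical names, and the assumption that only offset 0 counts as the entry reflects the generic check as I understand it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one symbol table entry. */
struct sym {
	const char *name;
	uint64_t start;
	uint64_t size;
};

/* Model of the lookup step: compute the offset of addr inside the
 * symbol. Returns false if addr lies outside the symbol. */
static bool sym_offset(const struct sym *s, uint64_t addr, uint64_t *offset)
{
	assert(s != NULL && offset != NULL);
	if (addr < s->start || addr - s->start >= s->size)
		return false;
	*offset = addr - s->start;
	return true;
}

/* Model of the entry test (assumption): only offset 0 is accepted. */
static bool on_func_entry(const struct sym *s, uint64_t addr)
{
	uint64_t off;

	return sym_offset(s, addr, &off) && off == 0;
}
```

Using the readelf values for kernel_clone (start ffff800010092640, size 1372), the DW_AT_low_pc address ffff800010092648 lands at offset 8 and is rejected, while the kallsyms address is accepted.
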
>
> Please don't use --- at the start of a line, it is used to separate from the patch
> itself, later down your message.
>
> It causes this:
>
> [acme@five perf]$ am /wb/1.patch
> Traceback (most recent call last):
>   File "/home/acme/bin/ksoff.py", line 180, in <module>
>     sign_msg(sys.stdin, sys.stdout)
>   File "/home/acme/bin/ksoff.py", line 142, in sign_msg
>     sob.remove(last_sob[0])
> TypeError: 'NoneType' object is not subscriptable
> [acme@five perf]$
>
> I'm fixing this by removing those --- markers.
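
To illustrate why a stray --- line is a problem, here is a small sketch of a splitter that treats the first line consisting only of "---" as the end of the changelog. It is a simplified model under that assumption, not the logic of git or of the ksoff.py script in the traceback; message_length() and message_has() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the changelog part of a mail body: everything before the
 * first line that is exactly "---". If there is no such line, the
 * whole body counts as changelog. */
static size_t message_length(const char *mail)
{
	const char *line = mail;

	assert(mail != NULL);
	while (*line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		if (len == 3 && strncmp(line, "---", 3) == 0)
			return (size_t)(line - mail);
		if (!eol)
			break;
		line = eol + 1;
	}
	return strlen(mail);
}

/* Returns 1 if needle occurs inside the changelog part, else 0. */
static int message_has(const char *mail, const char *needle)
{
	size_t n = message_length(mail);
	size_t k = strlen(needle);

	for (size_t i = 0; i + k <= n; i++)
		if (strncmp(mail + i, needle, k) == 0)
			return 1;
	return 0;
}
```

With an early --- in the changelog, the split happens before the Signed-off-by: line, so a tool that then looks for that tag in the changelog part finds nothing.
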
>
Sorry for the inconvenience.
Should I send another version to fix this issue?
Jianlin
> > The cause of the issue is that the address given by DW_AT_low_pc
> > for the function points past the NOPs instead of at them.
> > This issue exists in all GCC versions that support the
> > -fpatchable-function-entry option.
> >
> > I have reported it to the GCC community:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98776
> >
> > Currently arm64 and PA-RISC may enable the -fpatchable-function-entry option.
> > The kernel compiled with clang does not have this issue.
> >
> > FIX:
> >
> > This GCC issue only causes the registration failure of the kretprobe
> > event, which doesn't need debuginfo. So, stop using debuginfo for
> > retprobes; the map will be used to query the probe function address.
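
The effect of the fix can be modelled as a two-step resolver: the debuginfo step reports "nothing found" for return probes, and the caller then falls back to the symbol map. This is a sketch only; struct probe_req, try_debuginfo() and resolve_source() are hypothetical names, and the fallback-on-zero behaviour of the caller is an assumption based on the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum addr_source { FROM_DEBUGINFO, FROM_SYMBOL_MAP };

/* Hypothetical, reduced view of a probe request. */
struct probe_req {
	bool retprobe;        /* e.g. 'kernel_clone%return' */
	bool have_debuginfo;  /* debuginfo could be opened */
};

/* Model of the debuginfo step: returns the number of events found.
 * Returning 0 asks the caller to try the symbol map instead. */
static int try_debuginfo(const struct probe_req *req)
{
	if (req->retprobe)
		return 0;  /* the workaround: skip debuginfo for retprobes */
	return req->have_debuginfo ? 1 : 0;
}

/* Model of the caller: prefer debuginfo, else use the symbol map. */
static enum addr_source resolve_source(const struct probe_req *req)
{
	assert(req != NULL);
	return try_debuginfo(req) > 0 ? FROM_DEBUGINFO : FROM_SYMBOL_MAP;
}
```

In this model a return probe always resolves through the symbol map, whose address matches kallsyms, while ordinary probes keep using debuginfo when it is available.
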
> >
> > Signed-off-by: Jianlin Lv <Jianlin.Lv@xxxxxxx>
> > ---
> > v2: stop using debuginfo for retprobe, and update changelog.
> > ---
> > tools/perf/util/probe-event.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> > index 8eae2afff71a..a59d3268adb0 100644
> > --- a/tools/perf/util/probe-event.c
> > +++ b/tools/perf/util/probe-event.c
> > @@ -894,6 +894,16 @@ static int try_to_find_probe_trace_events(struct perf_probe_event *pev,
> >  	struct debuginfo *dinfo;
> >  	int ntevs, ret = 0;
> >
> > +	/* Workaround for gcc #98776 issue.
> > +	 * Perf failed to add kretprobe event with debuginfo of vmlinux which is
> > +	 * compiled by gcc with -fpatchable-function-entry option enabled. The
> > +	 * same issue with kernel module. The retprobe doesn't need debuginfo.
> > +	 * This workaround solution use map to query the probe function address
> > +	 * for retprobe event.
> > +	 */
> > +	if (pev->point.retprobe)
> > +		return 0;
> > +
> >  	dinfo = open_debuginfo(pev->target, pev->nsi, !need_dwarf);
> >  	if (!dinfo) {
> >  		if (need_dwarf)
> > --
> > 2.25.1
> >
>
> --
>
> - Arnaldo