Re: [RFC] ftrace: Add support to keep some functions out of ftrace
From: Jiri Olsa
Date: Sun Aug 14 2022 - 11:22:34 EST
On Sat, Aug 13, 2022 at 03:02:52PM -0400, Steven Rostedt wrote:
> On Fri, 12 Aug 2022 23:18:15 +0200
> Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
>
> > the patch below moves the bpf function into sepatate object and switches
> > off the -mrecord-mcount for it.. so the function gets profile call
> > generated but it's not visible to ftrace
> >
> > this seems to work, but it depends on -mrecord-mcount support in gcc and
> > it's x86 specific... other archs seems to use -fpatchable-function-entry,
> > which does not seem to have option to omit symbol from being collected
> > to the section
>
> As I stated. the __mcount_loc section was created by ftrace. It has
> nothing to do with -fpatchable-function-entry. It's just that the archs
> that use that, do not have a gcc that creates the __mcount_loc section.
>
> >
> > disabling specific ftrace symbol with FTRACE_FL_DISABLED flag seems to
> > be easir and generic solution.. I'll send RFC for that
>
> It's not easier.
>
> Here, I have a POC for you and some more history.
>
> The recordmcount.pl Perl script was the first to create the
> __mcount_loc section in all objects that ftrace needed to hook to. It
> did an objdump, found the locations of the calls to mcount, created
> another file that had a section __mcount_loc that referenced all those
> locations. Compiled and relinked that back into the original object.
>
> This was performed on all object files during the build, and had an
> impact on build times. But this is when I also created and introduced
> "make localmodconfig", which shrunk the build times for many people, so
> nobody noticed the build time increase! :-)
>
> Then John Reiser sent me a patch that created recordmcount.c that did
> the same work but directly modified the ELF object files without having
> to run scripts. This got rid of that horrible overhead from the scripts.
>
> Then, the gcc folks decided to be helpful here as well and created the
> --mrecord-mcount option that would create the __mcount_loc section for
> us, where we no longer needed the recordmcount scripts/C program. But
> is not available across the board.
>
> Today, objtool has also got involved, and added an "--mcount" option
> that will create the section too.
I overlooked that objtool is involved as well,
will check on that
>
> Below is a patch that extends yours by adding a NO_MCOUNT_FILES list,
> that you add the object file to and it will prevent the other methods
> from adding an mcount_loc location.
thanks,
jirka
>
> I'm adding the objtool folks to make sure this is fine with them.
> Again, this is a proof of concept, but works. It may need to be cleaned
> a bit before it is final.
>
> -- Steve
>
> Index: linux-trace.git/scripts/Makefile.build
> ===================================================================
> --- linux-trace.git.orig/scripts/Makefile.build
> +++ linux-trace.git/scripts/Makefile.build
> @@ -206,8 +206,10 @@ sub_cmd_record_mcount = perl $(srctree)/
> "$(if $(part-of-module),1,0)" "$(@)";
> recordmcount_source := $(srctree)/scripts/recordmcount.pl
> endif # BUILD_C_RECORDMCOUNT
> -cmd_record_mcount = $(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)), \
> +chk_sub_cmd_record_mcount = $(if $(filter $(shell basename $@),$(NO_MCOUNT_FILES)),, \
> $(sub_cmd_record_mcount))
> +cmd_record_mcount = $(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)), \
> + $(chk_sub_cmd_record_mcount))
> endif # CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT
>
> # 'OBJECT_FILES_NON_STANDARD := y': skip objtool checking for a directory
> Index: linux-trace.git/scripts/Makefile.lib
> ===================================================================
> --- linux-trace.git.orig/scripts/Makefile.lib
> +++ linux-trace.git/scripts/Makefile.lib
> @@ -233,7 +233,8 @@ objtool_args = \
> $(if $(CONFIG_HAVE_JUMP_LABEL_HACK), --hacks=jump_label) \
> $(if $(CONFIG_HAVE_NOINSTR_HACK), --hacks=noinstr) \
> $(if $(CONFIG_X86_KERNEL_IBT), --ibt) \
> - $(if $(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL), --mcount) \
> + $(if $(filter $(shell basename $@),$(NO_MCOUNT_FILES)),, \
> + $(if $(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL), --mcount)) \
> $(if $(CONFIG_UNWINDER_ORC), --orc) \
> $(if $(CONFIG_RETPOLINE), --retpoline) \
> $(if $(CONFIG_SLS), --sls) \
> Index: linux-trace.git/net/core/Makefile
> ===================================================================
> --- linux-trace.git.orig/net/core/Makefile
> +++ linux-trace.git/net/core/Makefile
> @@ -11,10 +11,15 @@ obj-$(CONFIG_SYSCTL) += sysctl_net_core.
> obj-y += dev.o dev_addr_lists.o dst.o netevent.o \
> neighbour.o rtnetlink.o utils.o link_watch.o filter.o \
> sock_diag.o dev_ioctl.o tso.o sock_reuseport.o \
> - fib_notifier.o xdp.o flow_offload.o gro.o
> + fib_notifier.o xdp.o flow_offload.o gro.o \
> + dispatcher.o
>
> obj-$(CONFIG_NETDEV_ADDR_LIST_TEST) += dev_addr_lists_test.o
>
> +# remove dispatcher function from ftrace sight
> +CFLAGS_REMOVE_dispatcher.o = -mrecord-mcount
> +NO_MCOUNT_FILES += dispatcher.o
> +
> obj-y += net-sysfs.o
> obj-$(CONFIG_PAGE_POOL) += page_pool.o
> obj-$(CONFIG_PROC_FS) += net-procfs.o
> Index: linux-trace.git/net/core/dispatcher.c
> ===================================================================
> --- /dev/null
> +++ linux-trace.git/net/core/dispatcher.c
> @@ -0,0 +1,3 @@
> +#include <linux/bpf.h>
> +
> +DEFINE_BPF_DISPATCHER(xdp)
> Index: linux-trace.git/net/core/filter.c
> ===================================================================
> --- linux-trace.git.orig/net/core/filter.c
> +++ linux-trace.git/net/core/filter.c
> @@ -11162,8 +11162,6 @@ const struct bpf_verifier_ops sk_lookup_
>
> #endif /* CONFIG_INET */
>
> -DEFINE_BPF_DISPATCHER(xdp)
> -
> void bpf_prog_change_xdp(struct bpf_prog *prev_prog, struct bpf_prog *prog)
> {
> bpf_dispatcher_change_prog(BPF_DISPATCHER_PTR(xdp), prev_prog, prog);
>