Re: [PATCH 2/2] perf record: Add --dry-run option to check cmdline options

From: Arnaldo Carvalho de Melo
Date: Mon Jun 20 2016 - 10:39:44 EST


On Mon, Jun 20, 2016 at 11:29:13AM +0800, Wangnan (F) wrote:
> On 2016/6/17 0:48, Arnaldo Carvalho de Melo wrote:
> >On Thu, Jun 16, 2016 at 08:02:41AM +0000, Wang Nan wrote:
> >>With '--dry-run', 'perf record' doesn't do real recording. Combined with
> >>the llvm.dump-obj option, --dry-run can be used to help compile BPF objects
> >>for embedded platforms.
> >So these are nice and have value, but can we have a subcommand to do all
> >this with an expressive name, something like:

> > perf bpfcc foo.c -o foo

> >or shorter:

> > perf bcc foo.c -o foo

> >Just like one would use gcc or some other compiler to generate something
> >for later use?

> I'll try it today. I thought a subcommand required a bigger feature,
> and wrapping clang is not big enough.

Not really, we may have as many as we like, given that they provide
something useful, like I think is the case here.

Having to edit ~/.perfconfig, create a new section and put a boolean
variable in it (at first, just reading the changeset comment, I thought
I had to provide a directory where to store the "dumped" objects), then
use a tool to record a .c event, but without recording (using dry-run,
which is useful to test the command line, etc), to finally get the end
result in the current directory, looked to me like a convoluted way to
ask perf to compile the given .c file into a .o for later use.

Doing:

perf bcc -c foo.c

Looks so much simpler, and so similar to the existing workflow for
compiling source code into an object file (gcc's, any C compiler's),
that I think it would fit really nicely in the workflow being discussed.

> >That if called as:
> >
> > perf bcc foo.c
> >
> >Would default to generating a foo.o file.
> >
> > Then, later, one could use this as an event name, i.e.
> >
> > trace --event foo
> >
> >Would, knowing that there is no event named "foo", look at the current
> >directory (and in some other places perhaps) for a file named "foo" that
> >was a bpf object file to use as it would a foo.c, short-circuiting the
> >bpf compilation code.
> >If this was done instead:
> >
> > trace --event foo.c
> >
> >And foo.c wasn't present, it would fall back to the behaviour described
> >in the previous paragraph: look for a foo.o or foo bpf object file, etc.
> >
> >What do you think?
>
> I'm not sure how many people can benefit from this feature. The only
> advantage I can see is that we can skip the '.c', '.o' or '.bpf' suffix.
>
> I guess what you really want is to introduce something like buildid-cache
> for BPF objects. One can compile his/her BPF scriptlets into .o using

Nope, the build-id cache is just that, a cache, somewhere to store object
files that had samples taken in some previous tool session for later
use.

Sure, we can store bpf .o files there, keyed by its build-id, etc, and
then store in the perf.data file the build-id to get it from the cache,
so that we could re-run that workload without having to go through the
process of recompiling the .c bpf file, if that can be done (running it
on the same kernel, perhaps on a different machine, etc).

> 'perf bcc' and insert it into the cache, then he/she can use the resulting
> object without remembering its path.

Well, this looks similar to what we do when we try to find a vmlinux
file: we look at a vmlinux_path, searching for a suitable file that has
the matching build-id, i.e. look at the current directory, then at
/boot/, /lib/modules/`uname -r`, /usr/lib/debug, build-id cache, etc.

The key here is to be able to register the .o file used in the perf.data
file without copying it, i.e. storing just its build-id in the perf.data
file build-id table.

> About fallback, if the user explicitly uses '.o' or '.bpf' as suffix our
> parser can be simpler. Technically we need a boundary to split event
> name and configuration. '.c', '.o' and '.bpf' are boundaries. In
> addition, is there any difference between '-e mybpf' and '-e
> mybpf.bpf'? We can define that, when using '-e mybpf' the search path
> would be the BPF object cache, when using '-e mybpf.bpf' the search
> path is the current directory. It is acceptable, but why not make '-e
> mybpf.bpf' search the BPF object cache as well?

Well there is a namespace issue here, if we say:

perf record -e cycles

then this is well known, we want PERF_TYPE_HARDWARE,
PERF_COUNT_HW_CPU_CYCLES. If we instead use:

perf record -e cycles.c

Then this also is well known, we need to build this somehow, and right
now the only way to do this is to use the llvm/clang infrastructure and
then load it into the kernel via sys_bpf.

If we say:

perf record -e cycles.bpf

Then we don't have anything associated with this and may go on trying to
map it to a PERF_TYPE_HARDWARE, PERF_TYPE_SOFTWARE, etc till we find a
suitable event, i.e. if it doesn't match anything, we would end up
looking at a file in the current directory, figure out it is an ELF file
and that its contents are a BPF proggie, that we would load via sys_bpf,
etc.
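
A minimal sketch of that resolution order, in Python for brevity rather
than perf's C. resolve_event() and is_bpf_object() are hypothetical
names, not existing perf functions, and the set of known events is passed
in by the caller; the only hard facts used are the ELF header layout and
EM_BPF (247) as the e_machine value for BPF objects:

```python
import struct

ELF_MAGIC = b"\x7fELF"
EM_BPF = 247  # e_machine value used for BPF object files


def is_bpf_object(path):
    """Return True if path is an ELF file whose e_machine is EM_BPF."""
    try:
        with open(path, "rb") as f:
            header = f.read(20)
    except OSError:
        return False
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        return False
    # EI_DATA (byte 5 of e_ident): 1 = little endian, 2 = big endian
    if header[5] == 1:
        fmt = "<H"
    elif header[5] == 2:
        fmt = ">H"
    else:
        return False
    # e_machine is the 16-bit field at offset 18, right after e_type
    (machine,) = struct.unpack_from(fmt, header, 18)
    return machine == EM_BPF


def resolve_event(name, known_events):
    """Hypothetical resolution order: known event, .c source, BPF object."""
    if name in known_events:
        return "event"
    if name.endswith(".c"):
        return "bpf-source"
    if is_bpf_object(name):
        return "bpf-object"
    return "unknown"
```

With this ordering a name such as cycles.bpf only reaches the file check
after every known event type has failed to match, which is the fallback
described above.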

But what I was proposing was to stick to what we have now, i.e.

perf record -e cycles.c

Means build and load an eBPF proggie via the clang/llvm infrastructure
and sys_bpf().

But... before doing that, look at the current directory (and the BPF proggie
path, that would include the build-id cache, like we do to find a vmlinux) to
find an object file that matches the cycles.c contents, to avoid having to run
clang/llvm every time we specify that .c eBPF event, and ultimately to remove
the requirement that we have the clang/llvm tools installed.

I.e. we would calculate a build-id from the .c file contents and then
look at the bpf pre-built proggie object path.

For binaries we can have an ELF section in an object file where we store
the build-id to avoid having to calculate it every time we need it; for
.c files on filesystems with extended attributes we could use
"user.checksum.sha256" as our build-id :-)

E.g.:

[acme@jouet ~]$ echo Hello, world > hello
[acme@jouet ~]$ setfattr -n user.checksum.sha256 -v `sha256sum hello | cut -d' ' -f1` hello
[acme@jouet ~]$ getfattr -n user.checksum.sha256 hello
# file: hello
user.checksum.sha256="37980c33951de6b0e450c3701b219bfeee930544705f637cd1158b63827bb390"

[acme@jouet ~]$ cat hello
Hello, world
[acme@jouet ~]$ sha256sum hello
37980c33951de6b0e450c3701b219bfeee930544705f637cd1158b63827bb390 hello
[acme@jouet ~]$

Having a .build-id section in the .o bpf file would be nice for that :-)

Note, "hello" above would be our .c bpf file, and that sha256sum would be our build-id that
would allow us to find the right .o file to use for a given .c file, in an environment
_without_ llvm/clang.

If it can't be found, perf would say that no suitable .o file matching that .c file
was found, go build one on your developer machine and then copy it over to the
machine where you want to use it without a llvm/clang environment.
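
To make the lookup concrete, here is a rough sketch in Python (not perf
code). It assumes, purely for illustration, that pre-built objects are
stored as <sha256-of-the-.c-file>.o in each directory of the search path;
that naming scheme and the function names source_build_id() and
find_prebuilt_object() are made up for this example:

```python
import hashlib
import os


def source_build_id(c_path):
    """SHA-256 of the .c file contents, as a hex string."""
    h = hashlib.sha256()
    with open(c_path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def find_prebuilt_object(c_path, search_dirs):
    """Return the first <dir>/<sha256>.o found, else raise FileNotFoundError.

    The <sha256>.o layout is an assumption made for this sketch only.
    """
    build_id = source_build_id(c_path)
    for d in search_dirs:
        candidate = os.path.join(d, build_id + ".o")
        if os.path.isfile(candidate):
            return candidate
    raise FileNotFoundError(
        f"no suitable .o file matching {c_path} (sha256 {build_id}) was "
        "found; build one on a developer machine and copy it over"
    )
```

The search order is just the order of search_dirs, so the current
directory can come first and the build-id cache last, like vmlinux_path.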

- Arnaldo