Re: [PATCH] perf record: Allow passing perf's own pid to '--filter'

From: pi3orama
Date: Mon Jul 06 2015 - 11:01:26 EST




发自我的 iPhone

> 在 2015年7月6日,下午9:56,Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> 写道:
>
> Em Mon, Jul 06, 2015 at 04:17:31AM +0000, Wang Nan escreveu:
>> This patch allows passing perf's own PID to '--filter' by using
>> '@PERFPID'. This should be useful when system-widely capturing
>> tracepoints events.
>
> Steven, does filters have any special meaning for @?
>
>> Before this patch, when doing something like:
>>
>> # perf record -a -e syscalls:sys_enter_write <cmd>
>>
>> One could easily get result like this:
>>
>> # /tmp/perf report --stdio
>> ...
>> # Overhead Command Shared Object Symbol
>> # ........ ....... .................. ....................
>> #
>> 99.99% perf libpthread-2.18.so [.] __write_nocancel
>> 0.01% ls libc-2.18.so [.] write
>> 0.01% sshd libc-2.18.so [.] write
>> ...
>>
>> Where most events are generated by perf itself.
>>
>> A shell trick can be done to filter perf itself out:
>>
>> # cat << EOF > ./tmp
>>> #!/bin/sh
>>> exec perf record -e ... --filter="common_pid != \$\$" -a sleep 10
>>> EOF
>> # chmod a+x ./tmp
>> # ./tmp
>>
>> However, doing so is user unfriendly.
>>
>> This patch introduces '@PERFPID' placeholder to '--filter' options. Now
>> user is allowed to the above work with:
>>
>> # perf record -e ... --filter="common_pid != @PERFPID' sleep 10
>>
>> Signed-off-by: Wang Nan <wangnan0@xxxxxxxxxx>
>> ---
>> tools/perf/Documentation/perf-record.txt | 1 +
>
> <SNIP>
>
>> +++ b/tools/perf/util/parse-events.c
>> @@ -1175,6 +1175,101 @@ int parse_events_option(const struct option *opt, const char *str,
>> return ret;
>> }
>>
>> +#ifndef PAGE_SIZE
>> +# define PAGE_SIZE 4096
>> +#endif
>
> You can use 'page_size', its available and filled via:
>
> page_size = sysconf(_SC_PAGE_SIZE);
>
> early in perf's main() routine.
>
>> +static int
>> +postproc_filter_append_token(const char *key, char *new_filter,
>> + ssize_t *pspace)
>> +{
>> + if (strcmp(key, "PERFPID") == 0) {
>> + char pid_buf[32];
>> + pid_t self_pid = getpid();
>> +
>> + snprintf(pid_buf, 32, "%d", self_pid);
>
> snprintf(pid_buf, sizeof(pid_buf), "%d", self_pid);
>
>> + strncat(new_filter, pid_buf, *pspace);
>> + *pspace -= strlen(pid_buf);
>> + if (*pspace < 0)
>> + return -1;
>> + return 0;
>> + }
>> +
>> + return -1;
>
> but then, please take a look at my perf/core branch, by coincidence I
> worked on having multiple filters set on a evsel in 'perf trace', where
> at a minimum, the tools' pid is added to the filter for all tracepoints
> used, i.e.:
>
> commom_pid != getpid()
>
> is always present, to avoid a feedback loop, neverending tracing of
> syscalls generated by the tracer itself.
>
> Then, if you use --filter-pids PID-1,PID-2,PID3, it will create an
> expression with that first part, for things like gnome-terminal, xorg,
> etc.
>
> Now we need to keep that in place and if the user uses -e to specify
> which syscalls it wants (or wants filtered out), we need to again
> concatenate with that commom_pid list, so that we call the filter ioctl
> just once, else the kernel returns EEXIST.
>
> Because I needed to append, etc, there are new functions there for
> go on creating such expressions, please use them, its all in my
> perf/core branch, the latest patches.
>
> I.e. having something in the filter expression that gets transformed
> into the tools' pid, I have no problem with that, just curious about
> what would be the best character to signal that a substitution needs to
> be performed, if it is really '@VAR', as my first selection would be
> '$VAR',

$ has special meaning for shell. Using $ in cmdline require users use escaping or '' quoted string. Therefore I believe @ should be better. What do you think?

> but then I haven't looked deeply at ftrace's filter stuff to see
> if it has provision for substitution in the kernel, etc.
>
> Andi also did this at some point, forgot why that wasn't applied at the
> time :-\
>
> For reference:
>
> https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=f47805a2af3ba83881ca52434bbbc6e9886b72fd
>

OK. I'll rebase this patch on it.

Thank you.

> - Arnaldo
>
>> +}
>> +
>> +static void postproc_filter(struct perf_evsel *evsel)
>> +{
>> + char *at = NULL, *sep = NULL, *old_filter, *new_filter;
>> + ssize_t space;
>> +
>> + if (!evsel->filter)
>> + return;
>> +
>> + old_filter = evsel->filter;
>> + at = strchr(old_filter, '@');
>> + if (!at)
>> + return;
>> +
>> + /*
>> + * See perf_event_set_filter(). Length of a valid filter is
>> + * limited by PAGE_SIZE.
>> + */
>> + new_filter = malloc(PAGE_SIZE);
>> + if (!new_filter) {
>> + fprintf(stderr, "No enough memory for post proc filter '%s'\n",
>> + old_filter);
>> + return;
>> + }
>> + *new_filter = '\0';
>> + space = PAGE_SIZE - 1;
>> +
>> + while (1) {
>> + if (at)
>> + *at = '\0';
>> + strncat(new_filter, old_filter, space);
>> + space -= strlen(old_filter);
>> + if (space < 0)
>> + goto errout;
>> + if (!at)
>> + break;
>> + *at = '@';
>> +
>> + sep = strchr(at + 1, ' ');
>> + if (sep)
>> + *sep = '\0';
>> +
>> + if (postproc_filter_append_token(at + 1, new_filter, &space))
>> + goto errout;
>> +
>> + if (!sep)
>> + break;
>> + *sep = ' ';
>> +
>> + old_filter = sep;
>> + at = strchr(old_filter, '@');
>> + }
>> +
>> + free(evsel->filter);
>> + /*
>> + * It is safe to use new_filter directly. However, try to
>> + * release some memory by strdup() a smaller string and free
>> + * new_filter, which takes a full page.
>> + */
>> + evsel->filter = strdup(new_filter);
>> + if (!evsel->filter)
>> + evsel->filter = new_filter;
>> + else
>> + free(new_filter);
>> + return;
>> +errout:
>> + if (at)
>> + *at = '@';
>> + if (sep)
>> + *sep = ' ';
>> + fprintf(stderr, "Can't post proc filter '%s'\n", evsel->filter);
>> + free(new_filter);
>> +}
>> +
>> int parse_filter(const struct option *opt, const char *str,
>> int unset __maybe_unused)
>> {
>> @@ -1196,6 +1291,7 @@ int parse_filter(const struct option *opt, const char *str,
>> return -1;
>> }
>>
>> + postproc_filter(last);
>> return 0;
>> }
>>
>> --
>> 1.8.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/