Re: [PATCH v3] perf evsel: Enable ignore_missing_thread for pid option

From: Arnaldo Carvalho de Melo
Date: Wed Dec 13 2017 - 15:54:21 EST


Em Wed, Dec 13, 2017 at 03:01:53PM +0800, Mengting Zhang escreveu:
> While monitoring a multithread process with pid option, perf sometimes
> may return sys_perf_event_open failure with 3(No such process) if any
> of the process's threads die before we open the event. However, we want
> perf continue monitoring the remaining threads and do not exit with error.

Right, I saw this while testing Jin Yao's 'perf stat --per-thread' in
system wide mode, where it enumerates (or tries to) all threads using
/proc, which, of couse, has races and leads to something this patch
claims (I haven't tested 8-)) to fix,

thanks in advance,

- Arnaldo

> Here, the patch enables perf_evsel::ignore_missing_thread for -p option
> to ignore complete failure if any of threads die before we open the event.
> But it may still return sys_perf_event_open failure with 22(Invalid) if we
> monitors several event groups.
>
> sys_perf_event_open: pid 28960 cpu 40 group_fd 118202 flags 0x8
> sys_perf_event_open: pid 28961 cpu 40 group_fd 118203 flags 0x8
> WARNING: Ignored open failure for pid 28962
> sys_perf_event_open: pid 28962 cpu 40 group_fd [118203] flags 0x8
> sys_perf_event_open failed, error -22
>
> That is because when we ignore a missing thread, we change the thread_idx
> without dealing with its fds, FD(evsel, cpu, thread). Then get_group_fd()
> may return a wrong group_fd for the next thread and sys_perf_event_open()
> return with 22.
>
> sys_perf_event_open(){
> ...
> if (group_fd != -1)
> perf_fget_light()//to get corresponding group_leader by group_fd
> ...
> if (group_leader)
> if (group_leader->ctx->task != ctx->task)//should on the same task
> goto err_context
> ...
> }
>
> This patch also fixes this bug by introducing perf_evsel__remove_fd() and
> update_fds to allow removing fds for the missing thread.
>
> Changes since v1:
> - Change group_fd__remove() into a more genetic way without changing code logic
> - Remove redundant condition
>
> Changes since v2:
> - Use a proper function name and add some comment.
> - Multiline comment style fixes.
>
> Signed-off-by: Mengting Zhang <zhangmengting@xxxxxxxxxx>
> ---
> tools/perf/builtin-record.c | 4 ++--
> tools/perf/util/evsel.c | 48 +++++++++++++++++++++++++++++++++++++++++++--
> 2 files changed, 48 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 0032559..36b6213 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1781,8 +1781,8 @@ int cmd_record(int argc, const char **argv)
> goto out;
> }
>
> - /* Enable ignoring missing threads when -u option is defined. */
> - rec->opts.ignore_missing_thread = rec->opts.target.uid != UINT_MAX;
> + /* Enable ignoring missing threads when -u/-p option is defined. */
> + rec->opts.ignore_missing_thread = rec->opts.target.uid != UINT_MAX || rec->opts.target.pid;
>
> err = -ENOMEM;
> if (perf_evlist__create_maps(rec->evlist, &rec->opts.target) < 0)
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index d5fbcf8..a0a3df7 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1596,10 +1596,47 @@ static int __open_attr__fprintf(FILE *fp, const char *name, const char *val,
> return fprintf(fp, " %-32s %s\n", name, val);
> }
>
> +static void perf_evsel__remove_fd(struct perf_evsel *pos,
> + int nr_cpus, int nr_threads,
> + int thread_idx)
> +{
> + for (int cpu = 0; cpu < nr_cpus; cpu++)
> + for (int thread = thread_idx; thread < nr_threads - 1; thread++)
> + FD(pos, cpu, thread) = FD(pos, cpu, thread + 1);
> +}
> +
> +static int update_fds(struct perf_evsel *evsel,
> + int nr_cpus, int cpu_idx,
> + int nr_threads, int thread_idx)
> +{
> + struct perf_evsel *pos;
> + struct perf_evlist *evlist = evsel->evlist;
> +
> + if (cpu_idx >= nr_cpus || thread_idx >= nr_threads)
> + return -EINVAL;
> +
> + evlist__for_each_entry(evlist, pos) {
> + nr_cpus = pos != evsel ? nr_cpus : cpu_idx;
> +
> + perf_evsel__remove_fd(pos, nr_cpus, nr_threads, thread_idx);
> +
> + /*
> + * Since fds for next evsel has not been created,
> + * there is no need to iterate whole event list.
> + */
> + if (pos == evsel)
> + break;
> + }
> + return 0;
> +}
> +
> static bool ignore_missing_thread(struct perf_evsel *evsel,
> + int nr_cpus, int cpu,
> struct thread_map *threads,
> int thread, int err)
> {
> + pid_t ignore_pid = thread_map__pid(threads, thread);
> +
> if (!evsel->ignore_missing_thread)
> return false;
>
> @@ -1615,11 +1652,18 @@ static bool ignore_missing_thread(struct perf_evsel *evsel,
> if (threads->nr == 1)
> return false;
>
> + /*
> + * We should remove fd for missing_thread first
> + * because thread_map__remove() will decrease threads->nr.
> + */
> + if (update_fds(evsel, nr_cpus, cpu, threads->nr, thread))
> + return false;
> +
> if (thread_map__remove(threads, thread))
> return false;
>
> pr_warning("WARNING: Ignored open failure for pid %d\n",
> - thread_map__pid(threads, thread));
> + ignore_pid);
> return true;
> }
>
> @@ -1724,7 +1768,7 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
> if (fd < 0) {
> err = -errno;
>
> - if (ignore_missing_thread(evsel, threads, thread, err)) {
> + if (ignore_missing_thread(evsel, cpus->nr, cpu, threads, thread, err)) {
> /*
> * We just removed 1 thread, so take a step
> * back on thread index and lower the upper
> --
> 1.7.12.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-perf-users" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html