Re: [PATCH v3] perf evsel: Enable ignore_missing_thread for pid option

From: Jiri Olsa
Date: Wed Dec 13 2017 - 08:57:05 EST


On Wed, Dec 13, 2017 at 03:01:53PM +0800, Mengting Zhang wrote:
> While monitoring a multithread process with pid option, perf sometimes
> may return sys_perf_event_open failure with 3(No such process) if any
> of the process's threads die before we open the event. However, we want
> perf continue monitoring the remaining threads and do not exit with error.
>
> Here, the patch enables perf_evsel::ignore_missing_thread for -p option
> to ignore complete failure if any of threads die before we open the event.
> But it may still return sys_perf_event_open failure with 22(Invalid) if we
> monitors several event groups.
>
> sys_perf_event_open: pid 28960 cpu 40 group_fd 118202 flags 0x8
> sys_perf_event_open: pid 28961 cpu 40 group_fd 118203 flags 0x8
> WARNING: Ignored open failure for pid 28962
> sys_perf_event_open: pid 28962 cpu 40 group_fd [118203] flags 0x8
> sys_perf_event_open failed, error -22
>
> That is because when we ignore a missing thread, we change the thread_idx
> without dealing with its fds, FD(evsel, cpu, thread). Then get_group_fd()
> may return a wrong group_fd for the next thread and sys_perf_event_open()
> return with 22.
>
> sys_perf_event_open(){
> ...
> if (group_fd != -1)
> perf_fget_light()//to get corresponding group_leader by group_fd
> ...
> if (group_leader)
> if (group_leader->ctx->task != ctx->task)//should on the same task
> goto err_context
> ...
> }
>
> This patch also fixes this bug by introducing perf_evsel__remove_fd() and
> update_fds to allow removing fds for the missing thread.
>
> Changes since v1:
> - Change group_fd__remove() into a more genetic way without changing code logic
> - Remove redundant condition
>
> Changes since v2:
> - Use a proper function name and add some comment.
> - Multiline comment style fixes.

Acked-by: Jiri Olsa <jolsa@xxxxxxxxxx>

thanks,
jirka