Re: [PATCH 35/40] perf record: Synthesize COMM event for a command line workload

From: Arnaldo Carvalho de Melo
Date: Tue May 19 2015 - 12:30:50 EST


Em Wed, May 20, 2015 at 12:12:45AM +0900, Namhyung Kim escreveu:
> On Tue, May 19, 2015 at 11:02:20AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Tue, May 19, 2015 at 04:46:43PM +0900, Namhyung Kim escreveu:
> > > On Mon, May 18, 2015 at 09:45:35AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > Em Mon, May 18, 2015 at 09:30:50AM +0900, Namhyung Kim escreveu:
> > > > > When perf creates a new child to profile, the events are enabled on
> > > > > exec(). And in this case, it doesn't synthesize any event for the
> > > > > child since they'll be generated during exec(). But there's an window
> > > > > between the enabling and the event generation.
> > > > >
> > > > > It used to be overcome since samples are only in kernel (so we always
> > > > > have the map) and the comm is overridden by a later COMM event.
> > > > > However it won't work anymore since those samples will go to a missing
> > > > > thread now but the COMM event will create a (current) thread. This
> > > > > leads to those early samples (like native_write_msr_safe) not having a
> > > > > comm but pid (like ':15328').
> > > >
> > > > > So it needs to synthesize COMM event for the child explicitly before
> > > > > enabling so that it can have a correct comm. But at this time, the
> > > > > comm will be "perf" since it's not exec-ed yet.
> > > >
> > > > This looks reasonable, but I think it probably needs to be done
> > > > somewhere in perf_evlist__prepare_workload() or
> > > > perf_evlist__start_workload(), as this affects other tools as well, like
> > > > 'top', 'trace' and any other that may want to do this start-workload use
> > > > case.
> > >
> > > Hmm.. I need to look at this again as it only affects on processing
> > > indexed data files which used to have a separate missing threads tree.
> >
> > Humm, you're thinking about where you managed to reproduce the problem,
> > I am thinking outside indexing, etc, i.e. by definition we either enable
> > the event before we fork, so that we get the PERF_RECORD_FORK/COMM or we
> > synthesize it either from /proc or directly (preferred) if we decide to
> > do it after the fork/exec, right?
>
> But as I said before, later COMM event will override thread->comm to a

But what happens for samples that take place before this "later COMM
event"? Even tho we know the COMM (we're starting the workload) we will
show just the pid, right?

Think about processing the samples with minimum delay (perf trace),
ordered by timestamp, but only ordering the head of the ring buffers,
not with a perf.data file.

- Arnaldo

> proper string as long as it can find a matching thread. So I think it
> has no problem in the current code.
>
> In the old version of this patchset (v3), indexing made it impossible
> for COMM event to find a matching thread since it used to have a
> separate tree for threads that have sampled before any FORK/COMM event
> came. I think it doesn't apply to the current version anymore, I
> will check it tomorrow.
>
> Thanks,
> Namhyung
>
>
> >
> > - Arnaldo
> >
> > > That's the reason why I didn't put it in a generic place like you
> > > said.
> > >
> > > However I changed not to use the separate tree - the purpose of the
> > > tree was to reduce lock acquisition on thread searching but it already
> > > grabs a rwlock with thread refcounting change.
> > >
> > > Will check whether this is still needed..
> > >
> > > Thanks,
> > > Namhyung
> > >
> > >
> > > >
> > > > I also wonder if we can't overcome this without using /proc, i.e.
> > > > actually moving the "start the workload" to just before the fork, so
> > > > that the kernel covers that as well.
> > > >
> > > > Or, alternatively, the thread can be created without having to look at
> > > > /proc at all, but by directly creating the struct thread, with the
> > > > correct COMM, pid, etc, that we know, since we forked it, etc.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/