Re: perf overlapping maps...

From: Arnaldo Carvalho de Melo
Date: Tue Oct 23 2018 - 15:27:55 EST


On Tue, Oct 23, 2018 at 11:15:03AM -0700, David Miller wrote:
> From: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> Date: Tue, 23 Oct 2018 15:05:03 -0300
> > IIRC this was first done for 'perf record', where we have to stash those
> > events in the perf.data file, to then, later, 'perf report' to process
> > those, so when working on 'perf top', it just reuses that machinery.

> > Sure, with some love and care 'perf top' could do better and update all
> > the data structures directly :-)

> Thanks for the history, it is useful information :)

> > Anyway, have you guys considered tweaking using event->header.misc |=
> > PERF_RECORD_MISC_USER? The kernel leaves that as zero for the
> > PERF_RECORD_FORK it emits:

> I really would like to steer the approach away from using UAPI
> perf_event fields in an internal way.

> I am really very sorry for suggesting such a scheme myself in the
> first place. It really was a bad idea upon much consideration.

> The synthetic fork is not really a fork, it's more like a "create".

> And this fundamental semantic difference is why we have all of these
> issues wrt. handling COMM and parent map inheritance.

> There is also a bunch of non-trivial code to deal with whether we
> synthetically create the child or the parent first, wrt. finding
> thread leaders and parent threads.

> What I'm trying to say is that there is a clean design based solution
> hiding somewhere in here and I'd like to find it :-)

So, this is all because we're trying to recreate things that happened in
the past, for threads we're interested in but for which we couldn't catch
the PERF_RECORD_{FORK,COMM,MMAP} events when they originally took place.

Ideally we would recreate them in the exact same order and with the
exact same details, which is roughly what was intended, but as you're
seeing, the current code fails at that in various cases.
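
To make the ordering point concrete, here is a rough sketch of one
possible synthesis order. It is only an illustration, not what the perf
tool does today: the function name and the (tid, tgid) input format are
made up, and the FORK, COMM, MMAP sequence per thread is my assumption
about what the "natural" order would be. Thread group leaders go first,
and only leaders get MMAP records, since the threads of a group share
one address space.

```python
def synthesis_order(threads):
    """Plan the synthesized records for pre-existing threads.

    threads: iterable of (tid, tgid) pairs (hypothetical input format).
    Returns a list of (record_kind, tid) tuples in emission order.
    """
    # Sort by thread group, then leader-before-others, then tid.
    # False sorts before True, so the leader (tid == tgid) comes first.
    ordered = sorted(threads, key=lambda t: (t[1], t[0] != t[1], t[0]))

    plan = []
    for tid, tgid in ordered:
        plan.append(("FORK", tid))
        plan.append(("COMM", tid))
        if tid == tgid:
            # Assumption: maps are emitted once per group, on the leader.
            plan.append(("MMAP", tid))
    return plan


if __name__ == "__main__":
    for kind, tid in synthesis_order([(102, 100), (100, 100), (101, 100)]):
        print(kind, tid)
```

With that ordering a consumer never sees a thread before its leader, so
it doesn't have to guess which of the two to create first.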

Also if we keep using this abstraction, i.e. synthesize in userspace
what the kernel does, generating PERF_RECORD_{FORK,COMM,MMAP}, then
older tools will continue working with perf.data files generated by
a new, fixed up 'perf record'.
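
A minimal sketch of why that compatibility holds, in Python for brevity.
The layouts follow the perf_event UAPI header (u32 type, u16 misc,
u16 size, then for FORK: u32 pid, ppid, tid, ptid and u64 time), with
the optional sample_id trailer left out. The two helper names are
hypothetical, little-endian is assumed, and time=0 is just a placeholder
for "we don't know when the original fork happened". The reader stands
in for an older tool: it can't tell whether the kernel or userspace
produced the record.

```python
import struct

# Record type value from the perf_event UAPI header.
PERF_RECORD_FORK = 7

# struct perf_event_header: u32 type, u16 misc, u16 size
HEADER = struct.Struct("<IHH")
# FORK payload without sample_id: u32 pid, ppid, tid, ptid; u64 time
FORK_BODY = struct.Struct("<IIIIQ")


def synthesize_fork(pid, ppid, tid, ptid, time=0):
    """Hypothetical helper: build a FORK record for a pre-existing thread."""
    size = HEADER.size + FORK_BODY.size
    # misc stays zero, as the kernel leaves it for the FORK it emits.
    header = HEADER.pack(PERF_RECORD_FORK, 0, size)
    return header + FORK_BODY.pack(pid, ppid, tid, ptid, time)


def parse_fork(buf):
    """Hypothetical reader, standing in for an existing perf.data consumer."""
    rtype, misc, size = HEADER.unpack_from(buf, 0)
    if rtype != PERF_RECORD_FORK:
        raise ValueError("not a FORK record")
    pid, ppid, tid, ptid, time = FORK_BODY.unpack_from(buf, HEADER.size)
    return {"misc": misc, "size": size, "pid": pid, "ppid": ppid,
            "tid": tid, "ptid": ptid, "time": time}


if __name__ == "__main__":
    print(parse_fork(synthesize_fork(pid=1234, ppid=1, tid=1234, ptid=1)))
```

A new record type would not parse this way in an older reader, which is
the trade-off behind the question below.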

And nowadays there are other tools that read perf.data files:

http://code.qt.io/cgit/qt-creator/perfparser.git/
https://doc.qt.io/qtcreator/creator-cpu-usage-analyzer.html

Or do you think we should introduce new record types that deal better
with pre-existing threads/maps?

- Arnaldo