Re: perf overlapping maps...

From: Arnaldo Carvalho de Melo
Date: Tue Oct 23 2018 - 14:05:11 EST


On Tue, Oct 23, 2018 at 10:54:05AM -0700, David Miller wrote:
> From: Jiri Olsa <jolsa@xxxxxxxxxx>
> Date: Tue, 23 Oct 2018 08:34:52 +0200
>
> > I'm not sure about using the misc field bit defined/used by userland,
> > in case there's some new one coming in future for fork event..
> >
> > but the only other way I can think of now is adding new 'user' event
> > for that, but that ended up as a bigger change (attached)
> >
> > I think if we make some 'big enough' comment about the bit usage,
> > your change is better.. will you post or should I?
>
> There might be something else we can do to implement this, and I think
> making a whole new event for what is an application internal problem
> is overkill.

Agreed, I saw this earlier today and thought "use cpumode", but got
sidetracked with processing other patches :-\ See below.

> What is kind of silly about how all of the synthetic events work is
> that we throw away a lot of information by tossing the events over to
> the generic event processing engine of the perf tool.
>
> So we generate the events knowing the thread, context, PID, cpu, etc.
> and then we lose all of that information, and the event processing
> engine has to look all of it up again.
>
> This is also, BTW, the reason we have dependencies on synthetic event
> emission ordering. F.e. this comes up wrt. COMM and FORK events.
>
> I understand that this design allows the perf tool types to define a
> private function to dispatch the events, as is appropriate for what
> the tool is doing.
>
> But the side effect of this design is that it means it is hard to pass
> internal state around, outside of the event object itself.
>
> Anyways, I'll look into this and see if there is a better way to
> implement this.

IIRC this was first done for 'perf record', where we have to stash those
events in the perf.data file so that 'perf report' can process them
later; when 'perf top' came along, it just reused that machinery.

Sure, with some love and care 'perf top' could do better and update all
the data structures directly :-)

Anyway, have you guys considered setting event->header.misc |=
PERF_RECORD_MISC_USER on the synthesized event? The kernel leaves misc
as zero for the PERF_RECORD_FORK it emits:

static void perf_event_task(struct task_struct *task,
			    struct perf_event_context *task_ctx,
			    int new)
{
	struct perf_task_event task_event;

	if (!atomic_read(&nr_comm_events) &&
	    !atomic_read(&nr_mmap_events) &&
	    !atomic_read(&nr_task_events))
		return;

	task_event = (struct perf_task_event){
		.task	  = task,
		.task_ctx = task_ctx,
		.event_id = {
			.header = {
				.type = new ? PERF_RECORD_FORK : PERF_RECORD_EXIT,
				.misc = 0,
				.size = sizeof(task_event.event_id),
			},
<SNIP>

void perf_event_fork(struct task_struct *task)
{
	perf_event_task(task, NULL, 1);
	perf_event_namespaces(task);
}

#define PERF_RECORD_MISC_CPUMODE_UNKNOWN	(0 << 0)
#define PERF_RECORD_MISC_KERNEL			(1 << 0)
#define PERF_RECORD_MISC_USER			(2 << 0)
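
To make the idea concrete, here is a minimal standalone sketch, not a patch
against the tree. The constants and the header layout mirror
include/uapi/linux/perf_event.h; the two helpers, mark_synthesized_fork() and
fork_is_synthesized(), are hypothetical names made up for illustration and do
not exist in tools/perf. It assumes the kernel keeps emitting FORK with
misc == 0, as in the code above.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror include/uapi/linux/perf_event.h. */
#define PERF_RECORD_MISC_CPUMODE_MASK		(7 << 0)
#define PERF_RECORD_MISC_CPUMODE_UNKNOWN	(0 << 0)
#define PERF_RECORD_MISC_KERNEL			(1 << 0)
#define PERF_RECORD_MISC_USER			(2 << 0)
#define PERF_RECORD_FORK			7

/* Same field layout as the uapi struct perf_event_header. */
struct perf_event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/*
 * Hypothetical helper: tag a FORK event that the tool itself synthesized,
 * so later processing can tell it apart from a kernel-emitted one.
 */
static void mark_synthesized_fork(struct perf_event_header *hdr)
{
	assert(hdr->type == PERF_RECORD_FORK);
	hdr->misc |= PERF_RECORD_MISC_USER;
}

/*
 * Hypothetical helper: nonzero if this FORK carries the userland tag.
 * Relies on the assumption that the kernel leaves misc as zero for FORK.
 */
static int fork_is_synthesized(const struct perf_event_header *hdr)
{
	return hdr->type == PERF_RECORD_FORK &&
	       (hdr->misc & PERF_RECORD_MISC_CPUMODE_MASK) ==
			PERF_RECORD_MISC_USER;
}
```

The check masks with PERF_RECORD_MISC_CPUMODE_MASK rather than testing the
whole misc field, so other misc bits, if any were ever set on FORK, would not
confuse it.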

- Arnaldo