Re: [PATCH RFC] hist lookups

From: Arnaldo Carvalho de Melo
Date: Wed Nov 07 2018 - 15:28:21 EST


Em Wed, Nov 07, 2018 at 12:01:54PM -0800, David Miller escreveu:
> From: Jiri Olsa <jolsa@xxxxxxxxxx>
> Date: Wed, 7 Nov 2018 20:43:44 +0100
>
> > I pushed new version in my perf/fixes branch
>
> Thanks, I'll check it out later today for sure! This is pretty exciting
> work.
>
> Just some random thoughts as I've been thinking about this whole
> situation a lot lately:
>
> Something to consider might be consolidating all of the event rings
> into one. This would force perf to process all events in "system
> order", ie. what order they actually occurred in the machine.

Processing in "system order" is what we want, yeah.

> Yes, this means more contention for the entities inside the kernel
> queueing up the events, however the benefits are enormous.
>
> Right now we go forwards and backwards in time as we move from one
> event ring to another, as you know.
>
> However, we have to reconcile with the need we have to separate "high
> priority" (ie. cannot really lose) events like fork, mmap2, etc. with
> "low priority" ones such as IP samples.

So perhaps we should tell the kernel that is ok to lose SAMPLEs but not
the other events, and make userspace ask for PERF_RECORD_!SAMPLE in all
ring buffers? Duplication wouldn't be that much of a problem?

> Perhaps another way to think about this is to go to the one huge mmap
> ring model, and do the prioritization internally in perf.
>
> Actually, this opens up tons of possibilities in my mind.
>
> Perf can queue to an internal high priority queue for fork and mmap2
> events, and never drop them. Whilst at the same time queueing low

Right, or put that in all queues? Would that be too costly?

> priority events like IP samples into a low priority queue and dropping
> with whatever policy it wants when overloaded (f.e. drop older events
> before newer ones).
>
> I do not like the overwrite ring buffer mode that was implemented
> because it enforeces an entire set of policy decisions upon the user.
> Either the model works for you (which it currently does not for perf)
> or you can't use it at all.
>
> If the issue is that newer events are more interesting than old ones,
> that is entirely perf's businness. And it can implement this policy
> %100 internally to itself. No kernel changes were ever needed to do
> this, as explained above.
>
> Please, let's abandon this whole overwrite mode of the ring buffer.
> The old one works perfectly fine, we just have to use it properly.
> We should never have to shut off kernel side event queueing just
> because we are processing the event ring on the user side.
>
> Thanks.