Re: [PATCH] perf inject: Flush ordered events on FINISHED_ROUND

From: Jiri Olsa
Date: Sun Oct 04 2020 - 15:52:50 EST


On Fri, Oct 02, 2020 at 10:03:17PM +0900, Namhyung Kim wrote:
> Currently perf inject just repipes the event without any flush. It
> makes an issue that it changes the order of events processed.
>
> Normally we want to process events in time order, but events are
> recorded from several cpus and they weren't sorted with each other.
> So we put them into the ordered event queue, sorted by time, and flush
> them when we see a next FINISHED_ROUND event.
>
> But this is for events from kernel, user space events (like the
> FINISHED_ROUND) are processed without queueing. So during the perf
> inject, it writes all the FINISHED_ROUND events directly while
> SAMPLE (and other) events are kept in the queue. This makes the user
> space events are put before any kernel events.
>
> You can see that with the following command:
>
> $ perf record -a -o- sleep 1 | perf inject -b | perf script -i- --show-round-events
> PERF_RECORD_FINISHED_ROUND
> PERF_RECORD_FINISHED_ROUND
> PERF_RECORD_FINISHED_ROUND
> ...
>
> Omitting perf inject, you can see the events are placed in the middle
> of the data.
>
> You might argue that the whole point of the FINISHED_ROUND event is to
> sort (kernel) events. And after perf inject, all events are written
> in a proper time order so we don't care about the FINISHED_ROUND event
> anymore.
>
> But the problem is we don't know whether the input data is sorted or
> not (maybe we can add a feature bit for this?) so it should use an
> ordered event queue when processing the input like in perf report.

I like the idea of storing the information that the data is sorted,
and when it's there, let's not use ordered_oevets

>
> Remember all the FINISHED_ROUND events now come before other events so
> the tool cannot know when it can flush the data so everything will be
> queued until it meets the end of the input. Actually it's same for
> perf inject itself as it doesn't flush the queue.
>
> Below measures time and memory usage during the perf inject and
> report using ~190MB data file.
>
> Before:
> perf inject: 11.09 s, 382148 KB
> perf report: 8.05 s, 397440 KB
>
> After:
> perf inject: 16.24 s, 83376 KB
> perf report: 7.96 s, 216184 KB
>
> As you can see, it used 2x memory of the input size. I guess it's
> because it needs to keep the copy for the whole input. But I don't
> understand why processing time of perf inject increased..

would be good to find out first

thanks,
jirka

>
> I'm not sure how it affects the auxtrace, but it should be fine IMHO.
>
> Cc: Al Grant <al.grant@xxxxxxxxxxxx>
> Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> ---
> tools/perf/builtin-inject.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
> index 6d2f410d773a..9fa78a9edfc5 100644
> --- a/tools/perf/builtin-inject.c
> +++ b/tools/perf/builtin-inject.c
> @@ -79,6 +79,7 @@ static int perf_event__repipe_oe_synth(struct perf_tool *tool,
> union perf_event *event,
> struct ordered_events *oe __maybe_unused)
> {
> + ordered_events__flush(oe, OE_FLUSH__ROUND);
> return perf_event__repipe_synth(tool, event);
> }
>
> --
> 2.28.0.806.g8561365e88-goog
>