Re: [PATCH v2] perf data: Fix releasing event_class

From: Arnaldo Carvalho de Melo
Date: Thu Jan 28 2016 - 14:09:40 EST


Em Thu, Jan 28, 2016 at 03:59:23PM +0100, Jiri Olsa escreveu:
> On Thu, Jan 28, 2016 at 01:20:28PM +0000, Wang Nan wrote:
> > A new patch of libbabeltrace [1] reveals a object leak problem in
> > perf data CTF support: perf code never release event_class which is
> > allocated in add_event() and stored in evsel's private field.
> >
> > If libbabeltrace has the above patch applied, leaking event_class
> > prevent writer being destroied and flushing metadata. For example:
> >
> > $ ./perf record ls
> > Lowering default frequency rate to 500.
> > Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
> > perf.data
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.012 MB perf.data (12 samples) ]
> > $ ./perf data convert --to-ctf ./out.ctf
> > [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
> > [ perf data convert: Converted and wrote 0.000 MB (12 samples) ]
> > $ cat ./out.ctf/metadata
> > $ ls -l ./out.ctf/metadata
> > -rw-r----- 1 w00229757 mm 0 Jan 27 10:49 ./out.ctf/metadata
> >
> > The correct result should be:
> > ...
> > $ cat ./out.ctf/metadata
> > /* CTF 1.8 */
> >
> > trace {
> > [SNIP]
> >
> > $ ls -l ./out.ctf/metadata
> > -rw-r----- 1 w00229757 mm 2446 Jan 27 10:52 ./out.ctf/metadata
> >
> > The full story is:
> >
> > Patch [1] of babeltrace redesign reference counting scheme. In that
> > patch:
> >
> > * writer <- trace (bt_ctf_writer_create)
> > * trace <- stream_class (bt_ctf_trace_add_stream_class)
> > * stream_class <- event_class (bt_ctf_stream_class_add_event_class)
> > ('<-' means 'is a parent of')
> >
> > Holding of event_class causes reference count of corresponding
> > 'writer' increases through parent chain. Perf expect 'writer' is
> > released (so metadata is flushed) through bt_ctf_writer_put() in
> > ctf_writer__cleanup(). However, since it never release event_class,
> > the reference of 'writer' won't be reduced, so bt_ctf_writer_put()
> > won't lead releasing of writer.
> >
> > Before this CTF patch, !(writer <- trace). Even event_class leak,
> > writer is able to be released.
> >
> > [1] https://github.com/efficios/babeltrace/commit/e6a8e8e4744633807083a077ff9f101eb97d9801
> >
> > Signed-off-by: Wang Nan <wangnan0@xxxxxxxxxx>
> > Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> > Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
> > Cc: Jérémie Galarneau <jeremie.galarneau@xxxxxxxxxxxx>
> > Cc: Zefan Li <lizefan@xxxxxxxxxx>
> > Cc: pi3orama@xxxxxxx
> > ---
> >
> > v1 -> v2: Free evsel->priv, destroy evlist
> > (even 'perf report' doesn't destroy evlist created by
> > perf_session__open)
>
> Acked-by: Jiri Olsa <jolsa@xxxxxxxxxx>
>
>
> Arnaldo,
> we're missing perf_evlist__delete in report command
>
> I remember there were several cleanups skipped intentionaly,
> but I dont think this one is the case.. also report TUI provides
> the 's' key to load new data file.. which makes this leak
> important
>
> however even with this patch I can see perf report getting
> more memory every time it reloads the data.. either there's
> something else, or I'm looking at it wrong ;-)

So, for the last delete, IIRC that teardown was removed because it made
the exit operation to be slower than just not calling
perf_evlist__delete (or was it perf_session__delete, don't recall right
now, probably the later, because it would delete all the threads, mmaps,
etc).

But yeah, for switching perf.data files on the fly, we better free all
that up...

> thanks,
> jirka
>
>
> ---
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index 2bf537f190a0..7a4a27a6f053 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -970,12 +970,14 @@ repeat:
>
> ret = __cmd_report(&report);
> if (ret == K_SWITCH_INPUT_DATA) {
> + perf_evlist__delete(session->evlist);
> perf_session__delete(session);
> goto repeat;
> } else
> ret = 0;
>
> error:
> + perf_evlist__delete(session->evlist);
> perf_session__delete(session);
> return ret;
> }