Re: [PATCH v2 perf/core] perf script: fix a use after free crash.

From: Krister Johansen
Date: Fri Jan 06 2017 - 01:24:31 EST


On Wed, Jan 04, 2017 at 12:37:39AM -0800, Krister Johansen wrote:
> On Mon, Jan 02, 2017 at 09:30:33PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Jan 02, 2017 at 04:39:04PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > Em Mon, Jan 02, 2017 at 02:36:57PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > Em Mon, Jan 02, 2017 at 02:35:30PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > Em Mon, Jan 02, 2017 at 12:15:14PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > {
> > > > zfree(&iter->priv);
> > > > iter->he = NULL;
> > > > + map__zput(al->map);
> > >
> > > What this pairs to? I was expecting that since this is called via:
> > >
> > > hist_entry_iter__add()
> > > {
> > > <SNIP>
> > > err2 = iter->ops->finish_entry(iter, al);
> > > }
> > >
> > > Then it would have to match something done earlier in
> > > hist_entry_iter__add(), most likely by some iter->ops->() method, but I
> > > couldn'd find anything to that extent, can you clarify?
> >
> > With the following patch it has been running all day, care to explain
> > why it is needed? I need to run this on valgrind or with Masami's
> > refcount debugger to get more clues :-\
>
> Let me try a version of this patch that dispenses with the code in both
> fill_callchain_info() and iter_finish_cumulative_entry. However, before
> I do that I'll make sure I can figure out how to reproduce the core that
> you were seeing. I don't want to send you yet another patch that
> doesn't run. The busy system may be the clue I needed. I had been
> running these on a mostly idle system.

I was able to reproduce your assertion failure using more load on a
system. After fixing up the issues caused by incorrectly updating the
refcounts in the hist_entry_iter's function pointers, I re-ran my own
tests and found that I was still getting a failure for 'perf report'.

The problem there seemed to be that a map was getting into a hist entry,
but was freed before the entry was actually created. The call to
hist_entry__init() that took a reference on the map was actually getting
freed memory by the time it tried to increment the reference count.

To address that, I modified hist_entry_iter__add() to take a reference
on the incoming al->map, if one exists. It drops that reference on the
way out of the function. With that change, I'm able to pass both your
'perf top' test under load and my own tests against a kernel without
debug info, but a kmod that has it.

I'll send out a v3 in just a moment.

Thanks again,

-K