Re: [PATCH v2] perf: Synchronously cleanup child events

From: Alexei Starovoitov
Date: Tue Jan 19 2016 - 16:58:32 EST


On Tue, Jan 19, 2016 at 09:05:58PM +0100, Peter Zijlstra wrote:
> On Tue, Jan 19, 2016 at 05:12:34PM +0200, Alexander Shishkin wrote:
>
> > +static void __put_event(struct perf_event *event)
> > {
> > struct perf_event_context *ctx;
> >
> > if (!is_kernel_event(event))
> > perf_remove_from_owner(event);
> >
>
> > +int perf_event_release_kernel(struct perf_event *event)
> > {
> > + struct perf_event *child, *tmp;
> > + LIST_HEAD(child_list);
> >
> > + if (!is_kernel_event(event))
> > + perf_remove_from_owner(event);
> >
> > + event->owner = NULL;
> >
> > +retry:
>
> <snip>
>
> > + /*
> > + * If this is the last reference, we're done here, otherwise
> > + * we must have raced with inherit_event(), in which case, repeat
> > + */
> > + if (!put_event_last(event))
> > + goto retry;
> >
> > + return 0;
> > +}
>
> So I think there's a number of problems still :-(
>
> I all starts with having two perf_remove_from_owner() calls (as I
> mentioned on IRC), this doesn't make sense.
>
> I think the moment you close the file and userspace looses control over
> it, we should drop the owner bit, which is exactly the one
> remove_from_owner in perf_release().
>
> If, for some magical reason, the event lives on after that (and we'll
> get to that), it should live on owner-less.
>
> Now, assume someone has such a magical reference, then our
> put_event_last() goto again loop will never terminate, this seems like a
> bad thing.
>
> The most obvious place that generates such magical references would be
> the bpf arraymap doing perf_event_get() on things. There are a few other
> places that take temp references (perf_mmap_close), but those are
> 'short' lived and while ugly will not cause massive grief. The BPF one
> OTOH is a real problem here.
>
> And looking at the BPF stuff, that code seems to assume
> perf_event_kernel_release() := put_event(), so this patch breaks that
> too.
>
>
> Alexei, is there a reason the arraymap stuff needs a perf event ref as
> opposed to a file ref? I'm forever a little confused on how perf<->bpf
> works.

A file ref will not work, since user space could have closed that
perf_event file to avoid unnecessary FDs.
Program only need the stable pointer to 'struct perf_event' which
it will use while running.
At the end it will call perf_event_kernel_release() which
is == put_event().
It was the case that 'perf_events' were normal refcnt-ed structures
and the last guy frees it.
This put_event_last() logic definitely looks problematic.
There are no ordering guarantees.
User space may close FD, while struct perf_event is still alive.
The loop around perf_event_last() looks buggy.
I'm obviously missing the main goal of this patch.