Re: [PATCH RFC v2 3/8] perf/core: Add support for event removal on exec

From: Peter Zijlstra
Date: Tue Mar 16 2021 - 12:24:37 EST


On Wed, Mar 10, 2021 at 11:41:34AM +0100, Marco Elver wrote:
> Adds bit perf_event_attr::remove_on_exec, to support removing an event
> from a task on exec.
>
> This option supports the case where an event is supposed to be
> process-wide only, and should not propagate beyond exec, to limit
> monitoring to the original process image only.
>
> Signed-off-by: Marco Elver <elver@xxxxxxxxxx>

> +/*
> + * Removes all events from the current task that have been marked
> + * remove-on-exec, and feeds their values back to parent events.
> + */
> +static void perf_event_remove_on_exec(void)
> +{
> + int ctxn;
> +
> + for_each_task_context_nr(ctxn) {
> + struct perf_event_context *ctx;
> + struct perf_event *event, *next;
> +
> + ctx = perf_pin_task_context(current, ctxn);
> + if (!ctx)
> + continue;
> + mutex_lock(&ctx->mutex);
> +
> + list_for_each_entry_safe(event, next, &ctx->event_list, event_entry) {
> + if (!event->attr.remove_on_exec)
> + continue;
> +
> + if (!is_kernel_event(event))
> + perf_remove_from_owner(event);
> + perf_remove_from_context(event, DETACH_GROUP);

There's a comment on this in perf_event_exit_event(), if this task
happens to have the original event, then DETACH_GROUP will destroy the
grouping.

I think this wants to be:

perf_remove_from_text(event,
child_event->parent ? DETACH_GROUP : 0);

or something.

> + /*
> + * Remove the event and feed back its values to the
> + * parent event.
> + */
> + perf_event_exit_event(event, ctx, current);

Oooh, and here we call it... but it will do list_del_even() /
perf_group_detach() *again*.

So the problem is that perf_event_exit_task_context() doesn't use
remove_from_context(), but instead does task_ctx_sched_out() and then
relies on the events not being active.

Whereas above you *DO* use remote_from_context(), but then
perf_event_exit_event() will try and remove it more.

> + }
> + mutex_unlock(&ctx->mutex);

perf_unpin_context(ctx);

> + put_ctx(ctx);
> + }
> +}