Re: [PATCH v0 3/5] perf: Introduce instruction trace filtering

From: Peter Zijlstra
Date: Fri Dec 11 2015 - 10:09:45 EST


On Fri, Dec 11, 2015 at 03:36:36PM +0200, Alexander Shishkin wrote:

> @@ -559,6 +590,10 @@ struct perf_event {
>
> atomic_t event_limit;
>
> + /* instruction trace filters */
> + struct list_head itrace_filters;
> + struct mutex itrace_filters_mutex;
> +
> void (*destroy)(struct perf_event *);
> struct rcu_head rcu_head;
>

> +static int __perf_event_itrace_filters_setup(void *info)
> +{
> + struct perf_event *event = info;
> + int ret;
> +
> + if (READ_ONCE(event->state) != PERF_EVENT_STATE_ACTIVE)
> + return -EAGAIN;
> +
> + /* matches smp_wmb() in event_sched_in() */
> + smp_rmb();
> +
> + /*
> + * There is a window with interrupts enabled before we get here,
> + * so we need to check again lest we try to stop another cpu's event.
> + */
> + if (READ_ONCE(event->oncpu) != smp_processor_id())
> + return -EAGAIN;
> +
> + event->pmu->stop(event, PERF_EF_UPDATE);
> + rcu_read_lock();

So you're holding rcu_read_lock() here to ensure the filter list is
observable. However, this is still very much racy: nothing stops another
filter from being added while we're trying to validate/program the hardware.

The solution we've used for other such places in perf is to use both a
mutex and a spinlock to protect the list. You need to hold both to
modify the list; holding either one ensures the list is stable.

That would allow you to hold the spinlock here, and call the pmu method
on a stable list.
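
The rule above (hold both locks to modify, hold either one to read) can be
sketched in userspace as follows. This is only an analogy, not the perf
code: pthread mutexes stand in for the kernel's mutex and spinlock, and
every name (struct filter_list, filter_list_add(), filter_list_program(),
filter_list_count()) is hypothetical and made up for illustration.

```c
/*
 * Userspace analogy only. Two pthread mutexes stand in for the kernel's
 * struct mutex (outer) and spinlock_t (inner). All names are hypothetical.
 */
#include <pthread.h>
#include <stdlib.h>

struct filter {
	unsigned long start, end;
	struct filter *next;
};

struct filter_list {
	pthread_mutex_t mutex;	/* outer lock: holder may do slow work */
	pthread_mutex_t lock;	/* inner lock: stand-in for the spinlock */
	struct filter *head;
	unsigned int nr;
};

static void filter_list_init(struct filter_list *fl)
{
	pthread_mutex_init(&fl->mutex, NULL);
	pthread_mutex_init(&fl->lock, NULL);
	fl->head = NULL;
	fl->nr = 0;
}

/* Writer: takes BOTH locks, always the outer mutex first, then the inner. */
static int filter_list_add(struct filter_list *fl,
			   unsigned long start, unsigned long end)
{
	struct filter *f = malloc(sizeof(*f));	/* allocate outside the locks */

	if (!f)
		return -1;
	f->start = start;
	f->end = end;

	pthread_mutex_lock(&fl->mutex);
	pthread_mutex_lock(&fl->lock);
	f->next = fl->head;	/* newest filter goes to the front */
	fl->head = f;
	fl->nr++;
	pthread_mutex_unlock(&fl->lock);
	pthread_mutex_unlock(&fl->mutex);
	return 0;
}

/*
 * Reader on the "program the hardware" path: the inner lock alone keeps
 * the list stable, because no writer can proceed without it.
 * Copies up to @max [start, end] pairs into @ranges, returns the count.
 */
static unsigned int filter_list_program(struct filter_list *fl,
					unsigned long *ranges,
					unsigned int max)
{
	struct filter *f;
	unsigned int n = 0;

	pthread_mutex_lock(&fl->lock);
	for (f = fl->head; f && n < max; f = f->next) {
		ranges[2 * n] = f->start;
		ranges[2 * n + 1] = f->end;
		n++;
	}
	pthread_mutex_unlock(&fl->lock);
	return n;
}

/* Reader that is allowed to block: the outer mutex alone is enough. */
static unsigned int filter_list_count(struct filter_list *fl)
{
	unsigned int n;

	pthread_mutex_lock(&fl->mutex);
	n = fl->nr;
	pthread_mutex_unlock(&fl->mutex);
	return n;
}
```

The lock order is fixed (mutex, then inner lock) so that two writers
cannot deadlock against each other. In the kernel the inner lock would be
a real spinlock, so whatever runs under it, such as the pmu callback,
must not sleep.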

> + ret = event->pmu->itrace_filter_setup(event);
> + rcu_read_unlock();
> + event->pmu->start(event, PERF_EF_RELOAD);
> +
> + return ret;
> +}

> +/*
> + * Insert an itrace @filter into @event's list of filters.
> + * @filter is used as a template
> + */
> +static int perf_itrace_filter_insert(struct perf_event *event,
> + struct perf_itrace_filter *src,
> + struct task_struct *task)
> +{

> + /*
> + * If we're called through perf_itrace_filters_clone(), we're already
> + * holding parent's filter mutex.
> + */
> + mutex_lock_nested(&event->itrace_filters_mutex, SINGLE_DEPTH_NESTING);
> + list_add_tail_rcu(&filter->entry, &event->itrace_filters);
> + mutex_unlock(&event->itrace_filters_mutex);
> +
> + return 0;
> +}