Re: [patch V3 23/32] perf/tracing/cpuhotplug: Fix locking order

From: Peter Zijlstra
Date: Tue May 30 2017 - 07:22:50 EST


On Wed, May 24, 2017 at 11:30:18AM -0700, Paul E. McKenney wrote:
> > @@ -8920,7 +8912,7 @@ perf_event_mux_interval_ms_store(struct
> > pmu->hrtimer_interval_ms = timer;
> >
> > /* update all cpuctx for this PMU */
> > - get_online_cpus();
> > + cpus_read_lock();
>
> OK, I'll bite...
>
> Why is this piece using cpus_read_lock() instead of pmus_lock?
>
> My guess is for the benefit of the cpu_function_call() below, but if
> the code instead cycled through the perf_online_mask, wouldn't any
> CPU selected be guaranteed to be online?
>
> Or is there some reason that it would be necessary to specially handle
> CPUs that perf does not consider to be active, but that are still at
> least partway online?

Mostly just lazy. This code path didn't present a problem with the lock
ordering. Find the conversion below.

>
> > for_each_online_cpu(cpu) {
> > struct perf_cpu_context *cpuctx;
> > cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu);
> > @@ -8929,7 +8921,7 @@ perf_event_mux_interval_ms_store(struct
> > cpu_function_call(cpu,
> > (remote_function_f)perf_mux_hrtimer_restart, cpuctx);
> > }
> > - put_online_cpus();
> > + cpus_read_unlock();
> > mutex_unlock(&mux_interval_mutex);
> >
> > return count;


---
Subject: perf: Complete CPU hotplug conversion

Remove the last cpus_read_lock() user in perf in favour of our internal
state. This conversion is non-critical, as the lock ordering wasn't
problematic, but it's nice to be consistent.

Reported-by: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
kernel/events/core.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 8d6acaeeea17..ad4f7f03b519 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -389,6 +389,16 @@ static atomic_t nr_switch_events __read_mostly;
static LIST_HEAD(pmus);
static DEFINE_MUTEX(pmus_lock);
static struct srcu_struct pmus_srcu;
+
+/*
+ * CPU hotplug handling, also see perf_event_{exit,init}_cpu().
+ *
+ * We use @pmus_lock to serialize PMU (un)registration against CPU hotplug,
+ * tracking the online state in @perf_online_mask and
+ * pmu->pmu_cpu_context->online. That latter is set while holding ctx->mutex
+ * and therefore holding ctx->mutex is sufficient to serialize against
+ * hotplug wrt cpuctx->online.
+ */
static cpumask_var_t perf_online_mask;

/*
@@ -8887,8 +8897,6 @@ perf_event_mux_interval_ms_show(struct device *dev,
return snprintf(page, PAGE_SIZE-1, "%d\n", pmu->hrtimer_interval_ms);
}

-static DEFINE_MUTEX(mux_interval_mutex);
-
static ssize_t
perf_event_mux_interval_ms_store(struct device *dev,
struct device_attribute *attr,
@@ -8908,12 +8916,12 @@ perf_event_mux_interval_ms_store(struct device *dev,
if (timer == pmu->hrtimer_interval_ms)
return count;

- mutex_lock(&mux_interval_mutex);
+ /* use pmus_lock to order against hotplug and self serialize */
+ mutex_lock(&pmus_lock);
pmu->hrtimer_interval_ms = timer;

/* update all cpuctx for this PMU */
- cpus_read_lock();
- for_each_online_cpu(cpu) {
+ for_each_cpu(cpu, perf_online_mask) {
struct perf_cpu_context *cpuctx;
cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu);
cpuctx->hrtimer_interval = ns_to_ktime(NSEC_PER_MSEC * timer);
@@ -8921,8 +8929,7 @@ perf_event_mux_interval_ms_store(struct device *dev,
cpu_function_call(cpu,
(remote_function_f)perf_mux_hrtimer_restart, cpuctx);
}
- cpus_read_unlock();
- mutex_unlock(&mux_interval_mutex);
+ mutex_unlock(&pmus_lock);

return count;
}
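
For readers who want the pattern outside the kernel, here is a rough userspace sketch of the idea behind the conversion: one mutex both serializes the writers and orders them against "hotplug" updates of a private online mask, so the writer only ever touches CPUs recorded in that mask. Everything in it is a hypothetical stand-in and not kernel API: state_lock plays the role of pmus_lock, online_mask the role of perf_online_mask, and cpu_online(), cpu_offline() and store_interval() are made-up names for illustration only.

```c
#include <pthread.h>

#define NR_CPUS 8

/* Hypothetical stand-ins for pmus_lock and perf_online_mask. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int online_mask;        /* bit N set: CPU N is tracked as online */
static int interval_ms[NR_CPUS];        /* per-CPU setting, zero until stored */

/* "Hotplug" side: the mask only changes while state_lock is held. */
static void cpu_online(int cpu)
{
	pthread_mutex_lock(&state_lock);
	online_mask |= 1u << cpu;
	pthread_mutex_unlock(&state_lock);
}

static void cpu_offline(int cpu)
{
	pthread_mutex_lock(&state_lock);
	online_mask &= ~(1u << cpu);
	pthread_mutex_unlock(&state_lock);
}

/*
 * Writer side: holding state_lock keeps the mask stable for the whole
 * walk and also serializes concurrent writers, so no second mutex is
 * needed. Returns the number of CPUs that were updated.
 */
static int store_interval(int ms)
{
	int cpu, updated = 0;

	pthread_mutex_lock(&state_lock);
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(online_mask & (1u << cpu)))
			continue;       /* skip CPUs not tracked as online */
		interval_ms[cpu] = ms;
		updated++;
	}
	pthread_mutex_unlock(&state_lock);

	return updated;
}
```

A caller would bring a few CPUs up with cpu_online(), optionally take one down again with cpu_offline(), and then call store_interval(); only the CPUs still set in online_mask get the new value. The sketch leaves out the cross-CPU call that the real code makes via cpu_function_call().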