[PATCH] fix race caused by hyperthreads when online an offline cpu

From: Zhou Chengming
Date: Thu Jan 12 2017 - 04:38:28 EST


After online an offline cpu, cpu_hw_events.excl_thread_id will always be
set to 1 in intel_pmu_cpu_starting() even when its sibling's excl_thread_id
is also 1. Then the two siblings will use the same state in their shared
hw_hw_events.excl_cntrs, it will cause race problem.

The race senario is like this:

Two cpu (7 and 19) are siblings, excl_thread_id of 7 and 19 are 0 and 1.
After offline and online cpu 7, intel_pmu_cpu_starting() will set excl_thread_id
of cpu 7 to 1. Then both cpu 7 and 19 will use the same state in their
shared hw_hw_events.excl_cntrs.

cpu7 cpu19
--- ---
intel_start_scheduling()
set state->sched_started = true
intel_put_excl_constraints() {
if (!state->sched_started)
spin_lock // not executed
intel_stop_scheduling()
set state->sched_started = false
if (!state->sched_started)
spin_unlock // excuted

Signed-off-by: NuoHan Qiao <qiaonuohan@xxxxxxxxxx>
Signed-off-by: Zhou Chengming <zhouchengming1@xxxxxxxxxx>
---
arch/x86/events/intel/core.c | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index a74a2db..593d8c9 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3164,13 +3164,16 @@ static void intel_pmu_cpu_starting(int cpu)

if (x86_pmu.flags & PMU_FL_EXCL_CNTRS) {
for_each_cpu(i, topology_sibling_cpumask(cpu)) {
+ struct cpu_hw_events *sibling;
struct intel_excl_cntrs *c;

- c = per_cpu(cpu_hw_events, i).excl_cntrs;
+ sibling = &per_cpu(cpu_hw_events, i);
+ c = sibling->excl_cntrs;
if (c && c->core_id == core_id) {
cpuc->kfree_on_online[1] = cpuc->excl_cntrs;
cpuc->excl_cntrs = c;
- cpuc->excl_thread_id = 1;
+ if (!sibling->excl_thread_id)
+ cpuc->excl_thread_id = 1;
break;
}
}
--
1.7.7