[PATCH] perf/x86: read the FREEZE_WHILE_SMM bit during boot

From: David Arcari
Date: Sun Jun 03 2018 - 14:24:12 EST


On some systems pressing the external NMI button is now failing to inject
an NMI 5-10% of the time. This causes confusion for a user that expects
the NMI to dump the system.

Commit 6089327f5424 ("perf/x86: Add sysfs entry to freeze counters on SMI")
does not read the firmware setting of the FREEZE_WHILE_SMM bit and will
always clear it when the PMU is initialized. As a result the performance
counters will always run and that greatly expands the race in which
external NMI will not be processed if a local NMI is already being
processed.

One option is to change default_do_nmi(). The code snippet below shows the
relevant portion of a patch that resolves the issue, but it is problematic
from a performance perspective and was dismissed.

-345,7 +345,17 @@ static void default_do_nmi(struct pt_regs *regs)
*/
if (handled > 1)
__this_cpu_write(swallow_nmi, true);
- return;
+
+ /*
+ * Unfortunately, there is a race condition which can
+ * result in a missing an external NMI. Typically, an
+ * external NMI is processed on cpu 0. Therefore, on
+ * cpu 0 check for an external NMI before returning.
+ */
+ if (smp_processor_id() ||
+ (x86_platform.get_nmi_reason() & NMI_REASON_MASK) == 0) {
+ return;
+ }
}

Ultimately, the issue can be resolved by storing the default firmware
setting of FREEZE_WHILE_SMM before initializing the PMU.

Fixes: 6089327f5424 ("perf/x86: Add sysfs entry to freeze counters on SMI")

Signed-off-by: David Arcari <darcari@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: x86@xxxxxxxxxx
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Cc: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: Donald Zickus <dzickus@xxxxxxxxxx>
Cc: Prarit Bhargava <prarit@xxxxxxxxxx>
Cc: Jerry Hoemann <jerry.hoemann@xxxxxxx>
---
arch/x86/events/intel/core.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 707b2a9..fce98df 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3325,6 +3325,18 @@ static void flip_smm_bit(void *data)
}
}

+static int read_smm_bit(void)
+{
+ u64 val;
+
+ if (!rdmsrl_safe(MSR_IA32_DEBUGCTLMSR, &val)) {
+ if (val & DEBUGCTLMSR_FREEZE_IN_SMM)
+ return 1;
+ }
+
+ return 0;
+}
+
static void intel_pmu_cpu_starting(int cpu)
{
struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
@@ -4423,6 +4435,8 @@ __init int intel_pmu_init(void)
pr_cont("full-width counters, ");
}

+ x86_pmu.attr_freeze_on_smi = read_smm_bit();
+
kfree(to_free);
return 0;
}
--
1.8.3.1