Re: [BUG] msr-trace.h:42 suspicious rcu_dereference_check() usage!

From: Borislav Petkov
Date: Mon Nov 21 2016 - 11:07:05 EST


+ tglx.

On Mon, Nov 21, 2016 at 04:41:04PM +0100, Peter Zijlstra wrote:
> Right, I just wondered about the !c1e present case. In that case we'll
> forever read the msr in the hope of finding that bit set, and we never
> will.

Well, erratum 400 flag is set on everything F10h from model 0x2 onwards,
which is basically everything that's still out there. And we set it for
most of the K8s too. So I don't think there's an affected machine out
there where we don't enable that erratum for so I'd assume it is turned
on everywhere.

IOW, what's the worst thing that can happen if we did this below?

We basically get rid of the detection and switch the timer to broadcast
mode immediately on the halting CPU.

amd_e400_idle() is behind an "if (cpu_has_bug(c, X86_BUG_AMD_APIC_C1E))"
check so it will run on the affected CPUs only...

Thoughts?

---
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 0888a879120f..7ab9d0254b4d 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -354,41 +354,30 @@ void amd_e400_remove_cpu(int cpu)
*/
static void amd_e400_idle(void)
{
- if (!amd_e400_c1e_detected) {
- u32 lo, hi;
+ int cpu = smp_processor_id();

- rdmsr(MSR_K8_INT_PENDING_MSG, lo, hi);
+ if (!static_cpu_has(X86_FEATURE_NONSTOP_TSC))
+ mark_tsc_unstable("TSC halt in AMD C1E");

- if (lo & K8_INTP_C1E_ACTIVE_MASK) {
- amd_e400_c1e_detected = true;
- if (!boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
- mark_tsc_unstable("TSC halt in AMD C1E");
- pr_info("System has AMD C1E enabled\n");
- }
- }
-
- if (amd_e400_c1e_detected) {
- int cpu = smp_processor_id();
+ pr_info("System has AMD C1E enabled\n");

- if (!cpumask_test_cpu(cpu, amd_e400_c1e_mask)) {
- cpumask_set_cpu(cpu, amd_e400_c1e_mask);
- /* Force broadcast so ACPI can not interfere. */
- tick_broadcast_force();
- pr_info("Switch to broadcast mode on CPU%d\n", cpu);
- }
- tick_broadcast_enter();
+ if (!cpumask_test_cpu(cpu, amd_e400_c1e_mask)) {
+ cpumask_set_cpu(cpu, amd_e400_c1e_mask);
+ /* Force broadcast so ACPI can not interfere. */
+ tick_broadcast_force();
+ pr_info("Switch to broadcast mode on CPU%d\n", cpu);
+ }
+ tick_broadcast_enter();

- default_idle();
+ default_idle();

- /*
- * The switch back from broadcast mode needs to be
- * called with interrupts disabled.
- */
- local_irq_disable();
- tick_broadcast_exit();
- local_irq_enable();
- } else
- default_idle();
+ /*
+ * The switch back from broadcast mode needs to be
+ * called with interrupts disabled.
+ */
+ local_irq_disable();
+ tick_broadcast_exit();
+ local_irq_enable();
}

/*

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.