[tip:perf/urgent] perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue

From: tip-bot for Ingo Molnar
Date: Fri Mar 25 2011 - 07:52:45 EST


Commit-ID: 45daae575e08bbf7405c5a3633e956fa364d1b4f
Gitweb: http://git.kernel.org/tip/45daae575e08bbf7405c5a3633e956fa364d1b4f
Author: Ingo Molnar <mingo@xxxxxxx>
AuthorDate: Fri, 25 Mar 2011 10:24:23 +0100
Committer: Ingo Molnar <mingo@xxxxxxx>
CommitDate: Fri, 25 Mar 2011 11:23:41 +0100

perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue

Eric Dumazet reported that hardware PMU events do not work on his
system, due to the BIOS corrupting PMU state:

Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only.
[Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c)

Linus suggested that we continue in the face of such BIOS-induced CPU
state corruption:

http://lkml.org/lkml/2011/3/24/608

Such BIOSes will have to be fixed - Linux developers rely on a working and
fully capable PMU and the BIOS interfering with the CPU's PMU state is simply
not acceptable.

So this patch changes perf to continue when it detects such BIOS
interaction, some hardware events may be unreliable due to the BIOS
writing and re-writing them - there's not much the kernel can do
about that but to detect the corruption and report it.

Reported-and-tested-by: Eric Dumazet <eric.dumazet@xxxxxxxxx>
Suggested-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Acked-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Cc: Mike Galbraith <efault@xxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
arch/x86/kernel/cpu/perf_event.c | 9 +++++++--
1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index ec46eea..eb00677 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -500,12 +500,17 @@ static bool check_hw_exists(void)
return true;

bios_fail:
- printk(KERN_CONT "Broken BIOS detected, using software events only.\n");
+ /*
+ * We still allow the PMU driver to operate:
+ */
+ printk(KERN_CONT "Broken BIOS detected, complain to your hardware vendor.\n");
printk(KERN_ERR FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", reg, val);
- return false;
+
+ return true;

msr_fail:
printk(KERN_CONT "Broken PMU hardware detected, using software events only.\n");
+
return false;
}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/