[PATCH v2 1/2] apic: Fix error interrupt report at all APs

From: Youquan Song
Date: Fri Jan 07 2011 - 11:39:22 EST


Recently, customer report that once machine boot, there are many error interrupt
reported with exact number of all APs.

The root cause is Local APIC will generate error interrupt when it detect
the illegal vector (one in 0 ~ 15) in an interrupt message received or
interrupt generate from local vector table or self IPI. SDM3A.chapter 10.

AP LAPIC thermal sensor register will be reset to 0x10000, if thermal throttling
interrupt take over by BIOS, it need restore AP with the thermal sensor register
value of geting from BSP, otherwise cause system issue. If BIOS does not take
over the thermal interrupt, The restore value will be CPU rest value of 0x10000,
which means the interrupt vector is zero. After writing 0x10000 to thermal
sensor LVT, the processor will recieve the error interrupt report if the APIC
error interrupt is also set.

This patch add check the BIOS whether take over the thermal interrupt by look
at interrupt delivery mode not fixed mode(BIOS handle will be SMI mode) before
restore AP's thermal LVT. So the agony noise of error interrupt will dismiss
when boot on machine that BIOS does not handle thermal interrupt..


Signed-off-by: Youquan Song <youquan.song@xxxxxxxxx>
---
arch/x86/include/asm/apicdef.h | 1 +
arch/x86/kernel/cpu/mcheck/therm_throt.c | 12 +++++++-----
2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/apicdef.h b/arch/x86/include/asm/apicdef.h
index a859ca4..f587be0 100644
--- a/arch/x86/include/asm/apicdef.h
+++ b/arch/x86/include/asm/apicdef.h
@@ -78,6 +78,7 @@
#define APIC_DEST_LOGICAL 0x00800
#define APIC_DEST_PHYSICAL 0x00000
#define APIC_DM_FIXED 0x00000
+#define APIC_DM_FIXED_MASK 0x00700
#define APIC_DM_LOWEST 0x00100
#define APIC_DM_SMI 0x00200
#define APIC_DM_REMRD 0x00300
diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index 4b68326..ecd6992 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -405,18 +405,20 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
*/
rdmsr(MSR_IA32_MISC_ENABLE, l, h);

+ h = lvtthmr_init;
/*
* The initial value of thermal LVT entries on all APs always reads
* 0x10000 because APs are woken up by BSP issuing INIT-SIPI-SIPI
* sequence to them and LVT registers are reset to 0s except for
* the mask bits which are set to 1s when APs receive INIT IPI.
- * Always restore the value that BIOS has programmed on AP based on
- * BSP's info we saved since BIOS is always setting the same value
- * for all threads/cores
+ * If BIOS take over the thermal interrupt and set its interrupt
+ * delivery mode to SMI not fixed, it restore the value that BIOS has
+ * programmed on AP based on BSP's info we saved since BIOS is always
+ * setting the same value for all threads/cores.
*/
- apic_write(APIC_LVTTHMR, lvtthmr_init);
+ if (h & APIC_DM_FIXED_MASK)
+ apic_write(APIC_LVTTHMR, lvtthmr_init);

- h = lvtthmr_init;

if ((l & MSR_IA32_MISC_ENABLE_TM1) && (h & APIC_DM_SMI)) {
printk(KERN_DEBUG
--
1.6.4.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/