[PATCH v6 2/2] x86 mce: use new printk recursion disabling interface

From: ShuoX Liu
Date: Wed Jun 06 2012 - 04:36:41 EST


From: ShuoX Liu <shuox.liu@xxxxxxxxx>

On x86 machines, some times MCE happens just when kernel calls printk
to output some log info to serial console, while usually MCE module in
kernel is used to print out some hardware error information, such like
bad cache or bad memory bank. That causes printk recursion and printk
would omit MCE printk output.

We hit it when running MTBF testing on Android ATOM mobiles.

Here in mce_panic, we choose to disable printk recursion to make sure
MCE logs printed out.

Signed-off-by: Yanmin Zhang <yanmin_zhang@xxxxxxxxxxxxxxx>
Signed-off-by: ShuoX Liu <shuox.liu@xxxxxxxxx>
---
v6: move the disabling checking in print_mce

---
arch/x86/kernel/cpu/mcheck/mce.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 2afcbd2..6056e94 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -242,6 +242,7 @@ static void print_mce(struct mce *m)
{
int ret = 0;

+ printk_recursion_check_disable();
pr_emerg(HW_ERR "CPU %d: Machine Check Exception: %Lx Bank %d: %016Lx\n",
m->extcpu, m->mcgstatus, m->bank, m->status);

@@ -275,10 +276,13 @@ static void print_mce(struct mce *m)
* (if the CPU has an implementation for that)
*/
ret = atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m);
- if (ret == NOTIFY_STOP)
+ if (ret == NOTIFY_STOP) {
+ printk_recursion_check_enable();
return;
+ }

pr_emerg_ratelimited(HW_ERR "Run the above through 'mcelog --ascii'\n");
+ printk_recursion_check_enable();
}

#define PANIC_TIMEOUT 5 /* 5 seconds */
--
1.7.1
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/