[PATCH] Fix kexec abort due to IPI from panic().

From: Seiji Aguchi
Date: Thu Sep 16 2010 - 16:45:36 EST


Hi,

I'm Seiji Aguchi.
I work for Hitachi Data Systems.
This is my first time sending a patch to LKML.
Nice to meet you.

I found an issue in kexec.
Please give me your comments and suggestions.

Kexec aborts when two CPUs panic at the same time.
An example scenario:
1. Two CPUs panic at the same time.
2. One CPU, cpu0, gets kexec_mutex in crash_kexec().
3. The other CPU, cpu1, can't get kexec_mutex and returns from crash_kexec().
4. Cpu0 runs kmsg_dump(KMSG_DUMP_KEXEC).
5. Cpu1 can't get dump_list_lock and returns from kmsg_dump(KMSG_DUMP_PANIC).
6. Cpu1 runs smp_send_stop() in panic() and sends an IPI to the other CPUs.
7. Cpu0 may receive the IPI from cpu1 while running kmsg_dump(KMSG_DUMP_KEXEC),
   crash_setup_regs(), or crash_save_vmcoreinfo(). If it does, cpu0 stops
   before it reaches machine_kexec().

We can solve this issue by disabling external interrupts on the local CPU
while it holds kexec_mutex in crash_kexec().
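
The ordering above can be replayed in a small userspace model. This is only
a sketch, not kernel code: struct model_cpu, model_trylock(),
model_send_stop() and model_double_panic() are hypothetical names made up for
illustration, and the model assumes the stop IPI is a maskable interrupt that
is delivered only while the target CPU has interrupts enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU: only the two bits this scenario needs. */
struct model_cpu {
	bool irqs_enabled;	/* can a stop IPI be delivered right now? */
	bool stopped;		/* has a stop IPI halted this CPU? */
};

/* Stand-in for kexec_mutex; true means some CPU holds it. */
static bool model_mutex_held;

/* Stand-in for mutex_trylock(): never blocks, reports success. */
static bool model_trylock(void)
{
	if (model_mutex_held)
		return false;
	model_mutex_held = true;
	return true;
}

/* Stand-in for smp_send_stop(): halts the target only if it takes IRQs. */
static void model_send_stop(struct model_cpu *target)
{
	if (target->irqs_enabled)
		target->stopped = true;
}

/*
 * Replay the double-panic ordering from the scenario.
 * Returns true if cpu0 is still running when it would call machine_kexec().
 */
static bool model_double_panic(bool disable_irqs)
{
	struct model_cpu cpu0 = { .irqs_enabled = true, .stopped = false };
	bool cpu0_won, cpu1_won;

	model_mutex_held = false;

	cpu0_won = model_trylock();		/* step 2: cpu0 gets the mutex */
	assert(cpu0_won);
	if (disable_irqs)
		cpu0.irqs_enabled = false;	/* models local_irq_save() */

	cpu1_won = model_trylock();		/* step 3: cpu1 fails */
	if (!cpu1_won)
		model_send_stop(&cpu0);		/* step 6: cpu1 sends the IPI */

	return !cpu0.stopped;			/* step 7: did cpu0 survive? */
}
```

In this model, model_double_panic(false) reports that cpu0 was stopped before
the kexec, and model_double_panic(true) reports that it was not. The model
fixes one particular interleaving; on real hardware the IPI may or may not
arrive inside the window.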


Signed-off-by: Seiji Aguchi <seiji.aguchi@xxxxxxx>

---
kernel/kexec.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/kernel/kexec.c b/kernel/kexec.c
index c0613f7..9e9f159 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1075,6 +1075,10 @@ void crash_kexec(struct pt_regs *regs)
 	 * sufficient. But since I reuse the memory...
 	 */
 	if (mutex_trylock(&kexec_mutex)) {
+		unsigned long flags;
+
+		local_irq_save(flags);
+
 		if (kexec_crash_image) {
 			struct pt_regs fixed_regs;

@@ -1085,6 +1089,9 @@ void crash_kexec(struct pt_regs *regs)
 			machine_crash_shutdown(&fixed_regs);
 			machine_kexec(kexec_crash_image);
 		}
+
+		local_irq_restore(flags);
+
 		mutex_unlock(&kexec_mutex);
 	}
 }
--
1.7.2.2


Regards,

Seiji