[patch] CPU hotplug: call check_tsc_sync_source() with irqs off

From: Ingo Molnar
Date: Wed Mar 07 2007 - 12:13:19 EST



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> [ Ingo and Thomas added to Cc, because I think this is them.. ]
>
> Ingo, I think this came in during commit 95492e4646, "x86: rewrite SMP
> TSC sync code".

yeah.

> > I get this while
> > echo shutdown > /sys/power/disk; echo disk > /sys/power/state
> >
> > BUG: using smp_processor_id() in preemptible [00000001] code: swsusp_shutdown/3359
> > caller is check_tsc_sync_source+0x1b/0xef

Michal, could you try the patch below?

Ingo

----------------------------->
Subject: [patch] CPU hotplug: call check_tsc_sync_source() with irqs off
From: Ingo Molnar <mingo@xxxxxxx>

check_tsc_sync_source() depends on being called with irqs disabled (it
checks whether the TSC is coherent across two specific CPUs). This is
incidentally true during bootup, but not during cpu hotplug __cpu_up().
This got found via smp_processor_id() debugging.

disable irqs explicitly and remove the unconditional enabling of
interrupts. Add touch_nmi_watchdog() to the cpu_online_map busy loop.

this bug is present both on i386 and on x86_64.

Reported-by: Michal Piotrowski <michal.k.k.piotrowski@xxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
arch/i386/kernel/smpboot.c | 16 ++++++++++------
arch/x86_64/kernel/smpboot.c | 5 ++++-
2 files changed, 14 insertions(+), 7 deletions(-)

Index: linux/arch/i386/kernel/smpboot.c
===================================================================
--- linux.orig/arch/i386/kernel/smpboot.c
+++ linux/arch/i386/kernel/smpboot.c
@@ -50,6 +50,7 @@
#include <linux/notifier.h>
#include <linux/cpu.h>
#include <linux/percpu.h>
+#include <linux/nmi.h>

#include <linux/delay.h>
#include <linux/mc146818rtc.h>
@@ -1283,8 +1284,9 @@ void __cpu_die(unsigned int cpu)

int __cpuinit __cpu_up(unsigned int cpu)
{
+ unsigned long flags;
#ifdef CONFIG_HOTPLUG_CPU
- int ret=0;
+ int ret = 0;

/*
* We do warm boot only on cpus that had booted earlier
@@ -1302,23 +1304,25 @@ int __cpuinit __cpu_up(unsigned int cpu)
/* In case one didn't come up */
if (!cpu_isset(cpu, cpu_callin_map)) {
printk(KERN_DEBUG "skipping cpu%d, didn't come online\n", cpu);
- local_irq_enable();
return -EIO;
}

- local_irq_enable();
-
per_cpu(cpu_state, cpu) = CPU_UP_PREPARE;
/* Unleash the CPU! */
cpu_set(cpu, smp_commenced_mask);

/*
- * Check TSC synchronization with the AP:
+ * Check TSC synchronization with the AP (keep irqs disabled
+ * while doing so):
*/
+ local_irq_save(flags);
check_tsc_sync_source(cpu);
+ local_irq_restore(flags);

- while (!cpu_isset(cpu, cpu_online_map))
+ while (!cpu_isset(cpu, cpu_online_map)) {
cpu_relax();
+ touch_nmi_watchdog();
+ }

#ifdef CONFIG_X86_GENERICARCH
if (num_online_cpus() > 8 && genapic == &apic_default)
Index: linux/arch/x86_64/kernel/smpboot.c
===================================================================
--- linux.orig/arch/x86_64/kernel/smpboot.c
+++ linux/arch/x86_64/kernel/smpboot.c
@@ -923,8 +923,9 @@ void __init smp_prepare_boot_cpu(void)
*/
int __cpuinit __cpu_up(unsigned int cpu)
{
- int err;
int apicid = cpu_present_to_apicid(cpu);
+ unsigned long flags;
+ int err;

WARN_ON(irqs_disabled());

@@ -958,7 +959,9 @@ int __cpuinit __cpu_up(unsigned int cpu)
/*
* Make sure and check TSC sync:
*/
+ local_irq_save(flags);
check_tsc_sync_source(cpu);
+ local_irq_restore(flags);

while (!cpu_isset(cpu, cpu_online_map))
cpu_relax();
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/