[PATCH] softlockup detection vs. cpu hotplug

From: Nathan Lynch
Date: Thu Feb 23 2006 - 19:24:56 EST


I'm able to trigger bogus softlockup warnings during cpu hotplug
testing, due to the percpu timestamp not being re-initialized before
the cpu starts servicing timer interrupts.

Before starting a cpu's watchdog thread, touch the cpu's timestamp to
avoid false positives.

In the watchdog thread, do touch_softlockup_watchdog in a
non-preemptible section so that it won't touch another cpu's
timestamp. This can happen in the window between the watchdog thread
getting forcefully migrated during a cpu offline operation and
kthread_should_stop.

Signed-off-by: Nathan Lynch <ntl@xxxxxxxxx>

---

kernel/softlockup.c | 20 ++++++++++++++++----
1 files changed, 16 insertions(+), 4 deletions(-)

33426e0c29ad973f9107cbd872648050f8988e61
diff --git a/kernel/softlockup.c b/kernel/softlockup.c
index c67189a..bd86fe1 100644
--- a/kernel/softlockup.c
+++ b/kernel/softlockup.c
@@ -34,9 +34,14 @@ static struct notifier_block panic_block
.notifier_call = softlock_panic,
};

+static void __touch_softlockup_watchdog(int cpu)
+{
+ per_cpu(timestamp, cpu) = jiffies;
+}
+
void touch_softlockup_watchdog(void)
{
- per_cpu(timestamp, raw_smp_processor_id()) = jiffies;
+ __touch_softlockup_watchdog(raw_smp_processor_id());
}
EXPORT_SYMBOL(touch_softlockup_watchdog);

@@ -73,6 +78,7 @@ void softlockup_tick(struct pt_regs *reg
static int watchdog(void * __bind_cpu)
{
struct sched_param param = { .sched_priority = 99 };
+ unsigned long bind_cpu = (unsigned long)__bind_cpu;

sched_setscheduler(current, SCHED_FIFO, &param);
current->flags |= PF_NOFREEZE;
@@ -86,10 +92,15 @@ static int watchdog(void * __bind_cpu)
*/
while (!kthread_should_stop()) {
msleep_interruptible(1000);
- touch_softlockup_watchdog();
+ /* When our cpu is offlined the watchdog thread can
+ * get migrated before it is stopped.
+ */
+ preempt_disable();
+ if (likely(smp_processor_id() == bind_cpu))
+ touch_softlockup_watchdog();
+ preempt_enable();
+ __set_current_state(TASK_RUNNING);
}
- __set_current_state(TASK_RUNNING);
-
return 0;
}

@@ -112,6 +123,7 @@ cpu_callback(struct notifier_block *nfb,
}
per_cpu(watchdog_task, hotcpu) = p;
kthread_bind(p, hotcpu);
+ __touch_softlockup_watchdog(hotcpu);
break;
case CPU_ONLINE:

--
1.1.5

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/