Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]

From: Thomas Gleixner
Date: Wed Jan 09 2008 - 06:57:38 EST


On Mon, 17 Dec 2007, Thomas Gleixner wrote:
> I try to come up with some more debug patches tomorrow.

Sorry took a bit longer than a day :(

Can you apply the patch below + the debug patch which prints the timer
stats on softlockup and provide the output of this.

Thanks,

tglx

diff --git a/include/linux/tick.h b/include/linux/tick.h
index f4a1395..22d921f 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -39,6 +39,8 @@ enum tick_nohz_mode {
* @idle_calls: Total number of idle calls
* @idle_sleeps: Number of idle calls, where the sched tick was stopped
* @idle_entrytime: Time when the idle call was entered
+ * @idle_waketime: Time when the idle was interrupted
+ * @idle_exittime: Time when the idle state was left
* @idle_sleeptime: Sum of the time slept in idle with sched tick stopped
* @sleep_length: Duration of the current idle sleep
*/
@@ -52,6 +54,8 @@ struct tick_sched {
unsigned long idle_calls;
unsigned long idle_sleeps;
ktime_t idle_entrytime;
+ ktime_t idle_waketime;
+ ktime_t idle_exittime;
ktime_t idle_sleeptime;
ktime_t sleep_length;
unsigned long last_jiffies;
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index cb89fa8..206a436 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -133,14 +133,15 @@ void tick_nohz_update_jiffies(void)
if (!ts->tick_stopped)
return;

- touch_softlockup_watchdog();
-
cpu_clear(cpu, nohz_cpu_mask);
now = ktime_get();

local_irq_save(flags);
tick_do_update_jiffies64(now);
local_irq_restore(flags);
+
+ ts->idle_waketime = now;
+ touch_softlockup_watchdog();
}

/**
@@ -369,6 +370,7 @@ void tick_nohz_restart_sched_tick(void)
* Cancel the scheduled timer and restore the tick
*/
ts->tick_stopped = 0;
+ ts->idle_exittime = now;
hrtimer_cancel(&ts->sched_timer);
ts->sched_timer.expires = ts->idle_tick;

diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index 12c5f4c..d3d94c1 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -166,6 +166,8 @@ static void print_cpu(struct seq_file *m, int cpu, u64 now)
P(idle_calls);
P(idle_sleeps);
P_ns(idle_entrytime);
+ P_ns(idle_waketime);
+ P_ns(idle_exittime);
P_ns(idle_sleeptime);
P(last_jiffies);
P(next_jiffies);


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/