[patch] softlockup: fix false positives on nohz if CPU is 100%idle for more than 60 seconds

From: Ingo Molnar
Date: Wed Apr 23 2008 - 04:50:59 EST



* David Miller <davem@xxxxxxxxxxxxx> wrote:

> > +++ linux/kernel/time/tick-sched.c
> > @@ -393,6 +393,7 @@ void tick_nohz_restart_sched_tick(void)
> > sub_preempt_count(HARDIRQ_OFFSET);
> > }
> >
> > + touch_softlockup_watchdog();
> > /*
> > * Cancel the scheduled timer and restore the tick
> > */
>
> The NOHZ lockup warnings are gone. But this seems like a band-aid.
> We made sure that cpus don't get into this state via commit:

but that commit:

> @@ -133,6 +133,8 @@ void tick_nohz_update_jiffies(void)
> if (!ts->tick_stopped)
> return;
>
> + touch_softlockup_watchdog();
> +
> cpu_clear(cpu, nohz_cpu_mask);
> now = ktime_get();

updates the softlockup watchdog before we update jiffies:

> ts->idle_waketime = now;
>
> local_irq_save(flags);
> tick_do_update_jiffies64(now);
> local_irq_restore(flags);

so the patch i gave you first should do the trick - the cleaner patch is
attached below.

> While what the guilty patch we're discussing here does is change how
> cpu_clock() is computed, that's it. softlockup uses cpu_clock() to
> calculate it's timestamp. The guilty change modified nothing about
> when touch_softlockup_watchdog() is called, nor any other aspect about
> how the softlockup mechanism itself works.
>
> So we need to figure out why in the world changing how cpu_clock()
> gets calculated makes a difference.

the thing is that that change _fixed_ cpu_clock() - and exposed the
latent softlockup/nohz interaction that was always there and which
caused false positives if a CPU stayed idle for more than 60 seconds.

Ingo

------------------>
Subject: softlockup: fix false positives on nohz if CPU is 100% idle for more than 60 seconds
From: Ingo Molnar <mingo@xxxxxxx>
Date: Wed Apr 23 10:01:08 CEST 2008

David Miller reported softlockup false positives on his 128-way Niagara2
system. The special thing about that system is extremely long clockevent
timeouts combined with extremly long idle times.

Fix rq->clock update bug that can trigger on such a system: in
tick_nohz_update_jiffies() [which is called on all irq entry on all cpus
where the irq entry hits an idle cpu] we call
touch_softlockup_watchdog() before we update jiffies. That works fine
most of the time when idle timeouts are within 60 seconds. But when an
idle timeout is beyond 60 seconds, jiffies is updated with a jump of
more than 60 seconds, which causes a jump in cpu-clock of more than 60
seconds, triggering a false positive.

Reported-by: David Miller <davem@xxxxxxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
kernel/time/tick-sched.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux/kernel/time/tick-sched.c
===================================================================
--- linux.orig/kernel/time/tick-sched.c
+++ linux/kernel/time/tick-sched.c
@@ -133,8 +133,6 @@ void tick_nohz_update_jiffies(void)
if (!ts->tick_stopped)
return;

- touch_softlockup_watchdog();
-
cpu_clear(cpu, nohz_cpu_mask);
now = ktime_get();
ts->idle_waketime = now;
@@ -142,6 +140,8 @@ void tick_nohz_update_jiffies(void)
local_irq_save(flags);
tick_do_update_jiffies64(now);
local_irq_restore(flags);
+
+ touch_softlockup_watchdog();
}

void tick_nohz_stop_idle(int cpu)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/