Re: [patch] fix the softlockup watchdog to actually work

From: Andrew Morton
Date: Wed Jul 25 2007 - 04:49:48 EST


On Tue, 17 Jul 2007 17:49:34 +0200 Ingo Molnar <mingo@xxxxxxx> wrote:

> this Xen related commit:
>
> commit 966812dc98e6a7fcdf759cbfa0efab77500a8868
> Author: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
> Date: Tue May 8 00:28:02 2007 -0700
>
>     Ignore stolen time in the softlockup watchdog
>
> broke the softlockup watchdog to never report any lockups. (!)
>
> print_timestamp defaults to 0, this makes the following condition
> always true:
>
> if (print_timestamp < (touch_timestamp + 1) ||
>
> and we'll in essence never report soft lockups.
>
> apparently the functionality of the soft lockup watchdog was never
> actually tested with that patch applied ...
>
> [this is -stable material too.]
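
To make the quoted reasoning concrete, here is a minimal sketch of the first clause of that condition, modeled in plain C. The helper name skip_report() is hypothetical, and the remaining clauses of the real check (cut off in the quote) are not modeled. The point is only that with print_timestamp stuck at its default of 0, the comparison holds for any ordinary touch_timestamp, so the report path is never reached.

```c
#include <stdbool.h>

/*
 * Simplified model of the first clause of the quoted check.
 * skip_report() is a hypothetical name, not a kernel function;
 * the clauses after the "||" in the quote are deliberately left out.
 */
bool skip_report(unsigned long print_timestamp, unsigned long touch_timestamp)
{
	/* true means "bail out without reporting" */
	return print_timestamp < (touch_timestamp + 1);
}
```

With print_timestamp == 0, skip_report(0, t) is true for every t except the one value where t + 1 overflows to 0, which matches the "never report" behaviour described above.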

It still isn't working. I'm getting random, meaningless softlockup reports
for no apparent reason.

I guess softlockup is busted in some other way and this patch just makes
that bustedness visible.

One possibility is that sched_clock() is bollixed and (say) it's returning
a 32-bit value. That'll cause the softlockup logic to get a bit sick when
time wraps.
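
As a rough illustration of that guess (and only a guess, nothing here is confirmed for this machine), the sketch below models a nanosecond clock truncated to 32 bits. Such a value wraps about every 4.29 seconds, so a seconds timestamp derived from it can jump backwards, and an unsigned "elapsed" computation then looks enormous. All function names and the threshold parameter are hypothetical and are not taken from the kernel source.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical model: a nanosecond clock that only keeps the low 32 bits. */
uint32_t truncated_clock_ns(uint64_t real_ns)
{
	return (uint32_t)real_ns;
}

/* Whole seconds derived from the truncated clock (wraps every ~4.29 s). */
unsigned long timestamp_sec(uint64_t real_ns)
{
	return (unsigned long)(truncated_clock_ns(real_ns) / NSEC_PER_SEC);
}

/*
 * Hypothetical lockup test: has more than `threshold` seconds passed
 * since the last touch? The subtraction is unsigned, so a timestamp
 * that went backwards produces a huge apparent delay.
 */
int looks_locked_up(unsigned long now, unsigned long touch,
		    unsigned long threshold)
{
	return now - touch > threshold;
}
```

For example, a touch at real time 4.0 s yields timestamp 4, while a check half a second later (4.5 s) yields timestamp 0 because the 32-bit value has wrapped; looks_locked_up(0, 4, 10) is then true even though only 0.5 s has passed.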

This machine (yes it's the Vaio) has marked the TSC unstable but afaict
that's OK.

So I'll shelve this patch for now.


I'm pretty heartily tired of the softlockup thing btw - it's been way more
trouble than benefit. Which is strange, for such a simple thing.