Re: Hang and Soft Lockup problems with generic time code
From: john stultz
Date: Fri Jul 07 2006 - 19:37:58 EST
On Fri, 2006-07-07 at 18:11 -0500, James Bottomley wrote:
> Ever since the 2.6.17 kernel pulled in the generic timer code, I've been
> experiencing hangs and softlockups with the aic94xx driver (which I
> thought were driver related). Finally, after a lot of debugging I've
> isolated the culprit to linux/time.h:timespec_add_ns()
>
> What is happening is that a->tv_nsec is coming in here negative and
> looping for huge amounts of time.
Yep. This has been seen where a large number of ticks are lost. Roman
and I are working on a solution for this (I sent a patch out to the list
earlier today for it, and Roman *just* posted his version a moment ago -
if you can give one or both of them a try it would be appreciated).
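
For anyone following along, here is a rough user-space sketch of why a
negative tv_nsec is so expensive. It assumes the normalization works by
adding tv_nsec into an unsigned 64-bit nanosecond count and then
subtracting one second per loop pass; struct ts_sketch and
normalize_iterations() are hypothetical names for illustration, not the
kernel's code. Rather than spinning, it just computes how many passes
such a loop would take.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for the kernel's struct timespec. */
struct ts_sketch {
	long tv_sec;
	long tv_nsec;
};

/*
 * Number of passes a "while (ns >= NSEC_PER_SEC) ns -= NSEC_PER_SEC;"
 * style loop would need after tv_nsec is folded into the unsigned
 * 64-bit count ns. A negative tv_nsec is sign-extended and wraps
 * around to a value close to 2^64.
 */
static uint64_t normalize_iterations(const struct ts_sketch *a, uint64_t ns)
{
	ns += (uint64_t)(int64_t)a->tv_nsec;	/* negative input wraps here */
	return ns / NSEC_PER_SEC;
}
```

With tv_nsec = 999999999 and ns = 2 this gives a single pass. With
tv_nsec = -1 and ns = 0 the sum wraps to 2^64 - 1, which is on the
order of 1.8e10 passes, and that would look just like a hang or soft
lockup.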
> Why tv_nsec is negative appears to be related to massive cycle
> adjustments in kernel/timer.c:update_wall_time(). With the TSC as my
> clocksource I've seen the clocksource_read() return increments in the
> 200s range. No idea why this is happening. The same strange
> discontinuous jumps in cycle count also occur with pm_acpi as the clock
> source.
Did you really mean jumps of 200 seconds? Hmmm. The issue Roman and I
have been looking into does occur when we lose a number of ticks and
that confuses the clocksource adjustment code. The fix we're working on
corrects the adjustment confusion, but doesn't fix the lost ticks.
However, 200 seconds of lost ticks sounds very off. Could the driver be
disabling interrupts for such a long period of time?
> I can't get a good enough handle on all the generic time code changes to
> reverse them. However, this machine is a P4, so I was able to boot it
> with an x86_64 kernel (which doesn't yet use the generic time code) and
> confirm that all the hangs and softlockups go away.
>
> The machine in question is an IBM x206m dual core P4.
I appreciate the report and apologize for the trouble.
thanks
-john