[BUG]: hrtimer overflow bug on 64-bit systems

From: David Miller
Date: Thu May 24 2007 - 18:06:47 EST



I've been tracing this problem on sparc64 now that it uses hi-res
timers, and I finally figured it out.

The symptom is that ksoftirqd on most of the cpus of an idle SMP
system chews up around 10% of cpu time.

raise_softirq_irqoff() is constantly being invoked via
tick_nohz_stop_sched_tick(), but why?

Tracing revealed that delta_jiffies is enormous, which is correct
because no timers are pending on the cpu, so we should schedule the
hrtimer as far into the future as possible.

tick_nohz_stop_sched_tick() then proceeds to start the hrtimer, and
then it checks hrtimer_active() but for some reason this is always
false.

Why?

The reason is that the expires calculation here:

	expires = ktime_add_ns(last_update, tick_period.tv64 *
			       delta_jiffies);

overflows on 64-bit systems.

On 32-bit systems, LONG_MAX is a 32-bit quantity, so even the largest
possible delta_jiffies doesn't cause an overflow in this multiply
(which is 64-bit). (LONG_MAX determines how large a value
will be returned to indicate "infinity" from get_next_timer_interrupt();
specifically, it's the initialization of the local variable 'expires'
in __next_timer_interrupt().)

Because of the overflow, 'expires' is zero on 64-bit, and of course
that causes the hrtimer to not get scheduled at all.
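
The 32-bit versus 64-bit difference described above can be checked in
userspace. What follows is only a sketch, not kernel code: the helper
name tick_mul_overflows() is hypothetical, and the tick period of
1,000,000 ns used in the note below assumes HZ=1000. It tests whether
the product would fit in a signed 64-bit value without actually
performing the overflowing multiply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the kernel: report whether
 * tick_period_ns * delta_jiffies would exceed INT64_MAX.
 * The check divides instead of multiplying, so it never
 * triggers signed overflow itself.
 */
static bool tick_mul_overflows(int64_t tick_period_ns, int64_t delta_jiffies)
{
	/* Illustrative preconditions: positive period, non-negative delta. */
	assert(tick_period_ns > 0 && delta_jiffies >= 0);

	return delta_jiffies > INT64_MAX / tick_period_ns;
}
```

With a tick period of 1000000 ns, tick_mul_overflows(1000000, INT32_MAX)
is false (the 32-bit LONG_MAX case), while
tick_mul_overflows(1000000, INT64_MAX) is true (the 64-bit LONG_MAX
case).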

I'm surprised this problem is not seen with the x86_64 hrtimer patches
applied :-)

I'm not exactly sure how best to fix this. We could either make
__next_timer_interrupt() use INT_MAX, or we could make
tick_nohz_stop_sched_tick() detect and handle the overflow
properly. Neither of those solutions is fully satisfactory in
my opinion :-)

FWIW, I've verified that using INT_MAX in __next_timer_interrupt()
makes the problem go away.
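
For illustration, the second option (handling the overflow in the
expires calculation) might look roughly like the userspace sketch
below. This is an assumption about one possible shape of such a fix,
not a tested kernel patch: expires_saturating() is a hypothetical name,
plain int64_t nanoseconds stand in for ktime_t, and clamping to
INT64_MAX is just one choice of "as far into the future as possible".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the expires calculation, not kernel code.
 * Computes last_update_ns + tick_period_ns * delta_jiffies, but clamps
 * the result to INT64_MAX instead of letting it overflow.
 */
static int64_t expires_saturating(int64_t last_update_ns,
				  int64_t tick_period_ns,
				  int64_t delta_jiffies)
{
	/* Illustrative preconditions for this sketch. */
	assert(last_update_ns >= 0 && tick_period_ns > 0 && delta_jiffies >= 0);

	/* Would the multiply plus the add exceed INT64_MAX? */
	if (delta_jiffies > (INT64_MAX - last_update_ns) / tick_period_ns)
		return INT64_MAX;

	return last_update_ns + tick_period_ns * delta_jiffies;
}
```

A small delta is passed through unchanged, while a delta near the
64-bit LONG_MAX yields INT64_MAX rather than a wrapped value.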
