Re: Jiffy based timers/timeouts can expire too soon.
From: George Anzinger
Date: Thu Dec 16 2004 - 15:43:58 EST
Anton Blanchard wrote:
>> Well, hopefully the lost tick detection code won't overcompensate, so
>> it shouldn't be an issue. However, as Tim Mann pointed out, due to
>> interrupt delay and queuing, it is seen on virtualized systems.
>
> We saw this on ppc64 on earlier 2.6 kernels. There were some bugs with
> the VM where interrupts would get disabled for a long time (we saw 20+
> second periods). A SCSI timeout would occur on another CPU and at that
> time irqs would get reenabled and 20 seconds of time would get replayed.
> A bunch of timers would go off early and the SCSI adapter would explode.

The problem is that "most" code believes jiffies is right. Under long
interrupt-off times, it is not. I suspect that most of the early timers came
from code that set the timer with the interrupt system off. Some might say
they got what they deserved :).
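
To make the arithmetic concrete, here is a minimal sketch in plain user-space C. It is not kernel code: HZ = 1000 and the helper name real_delay_ticks() are assumptions chosen for illustration. It computes how long a timer really waits when its expiry was calculated from a jiffies value that was lagging behind real time.

```c
#include <assert.h>
#include <stdint.h>

#define HZ 1000u  /* assumed tick rate, for illustration only */

/*
 * Hypothetical helper: real delay, in ticks, before a timer fires when it
 * was armed against a stale jiffies value. Once the lost ticks are replayed,
 * jiffies jumps from stale_jiffies to true_jiffies, so only the remainder
 * of the requested timeout is actually waited.
 */
static uint64_t real_delay_ticks(uint64_t stale_jiffies, uint64_t true_jiffies,
                                 uint64_t timeout)
{
    assert(true_jiffies >= stale_jiffies);

    uint64_t expires = stale_jiffies + timeout;  /* what the caller computed */

    /* If the replay already carries jiffies past expires, it fires at once. */
    return expires > true_jiffies ? expires - true_jiffies : 0;
}
```

With jiffies lagging by 20 seconds, a 30 second timeout armed at that moment waits only about 10 seconds of real time, and any timeout of 20 seconds or less fires as soon as the lost ticks are replayed.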

In the HRT patch, we always correct jiffies to the real value (by marking the
TSC value at the last jiffies push and using that plus the current TSC to
correct). It would be rather easy to provide an interface to get the real
current jiffies value, but it is another thing to correct all the code that
uses jiffies. Attempts to make jiffies a macro pick up far too many uses of
the word in several name spaces to make it a reasonable thing to do.
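
The correction described above can be sketched roughly as follows. This is a user-space illustration, not code from the HRT patch: struct jiffy_mark, real_jiffies() and the cycles_per_jiffy parameter are hypothetical names, and the sketch assumes a cycle counter that runs at a constant rate and does not wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot taken each time the tick handler pushes jiffies. */
struct jiffy_mark {
    uint64_t jiffies;  /* jiffies value at the last push */
    uint64_t tsc;      /* cycle counter value read at that same moment */
};

/*
 * Estimate the real jiffies value: the ticks already counted, plus the
 * whole ticks that the cycle counter says have elapsed since the mark.
 */
static uint64_t real_jiffies(const struct jiffy_mark *m, uint64_t tsc_now,
                             uint64_t cycles_per_jiffy)
{
    assert(cycles_per_jiffy != 0);
    assert(tsc_now >= m->tsc);

    return m->jiffies + (tsc_now - m->tsc) / cycles_per_jiffy;
}
```

A caller arming a timer with interrupts off would use the value returned here in place of the raw counter, so the expiry is computed from a time base that has not fallen behind.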
George Anzinger george@xxxxxxxxxx