The area I frobbed was not #ifdef'd __SMP__; there's only one section
in my current time.c that is, but that happens to be in the
do_timer_interrupt function (!).
I tracked it down after I noticed my asclock blinking forward (to a
later time) and then back; I figured out how that could cause the
screensaver problems I was seeing, and my X server vendor (XiG)
confirmed that non-monotonic time could cause erratic mouse
behaviour. Then I wrote that gettimeofday() program and saw the
problem in all its glory, and finally I hacked my kernel to prevent
gettimeofday from returning an earlier time. Of course, I didn't
prevent time from increasing, and I still get occasional crashes when
the clock starts racing forward, but I don't know what triggers it.
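For anyone who wants to check their own box, here is a minimal sketch of that kind of test. It is not the original program or the actual kernel patch: the names tv_to_usec, count_backward_steps and clamp_monotonic are made up for illustration. The first part polls gettimeofday() and counts backward steps; the second shows the clamping idea in user space, which only stops time going backward and lets forward jumps straight through.

```c
#include <stdio.h>
#include <sys/time.h>

/* Convert a struct timeval to microseconds for easy comparison. */
static long long tv_to_usec(const struct timeval *tv)
{
    return (long long)tv->tv_sec * 1000000LL + tv->tv_usec;
}

/*
 * Call gettimeofday() `iterations` times and count how often a reading
 * is earlier than the one before it. Returns -1 if gettimeofday() fails.
 */
static long count_backward_steps(long iterations)
{
    struct timeval tv;
    long long prev, now;
    long backward = 0;
    long i;

    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    prev = tv_to_usec(&tv);

    for (i = 0; i < iterations; i++) {
        if (gettimeofday(&tv, NULL) != 0)
            return -1;
        now = tv_to_usec(&tv);
        if (now < prev) {
            backward++;
            printf("time went backward by %lld usec\n", prev - now);
        }
        prev = now;
    }
    return backward;
}

/*
 * Hypothetical clamp: never report a time earlier than the last one
 * reported. Forward jumps are NOT filtered. Not thread-safe; a real
 * kernel version would need proper locking on SMP.
 */
static long long clamp_monotonic(long long now_usec)
{
    static long long last_usec;

    if (now_usec < last_usec)
        return last_usec;
    last_usec = now_usec;
    return now_usec;
}
```

Calling count_backward_steps() with a few million iterations from main() while the machine is busy should be enough to see whether the clock ever steps back.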
I have a debugging printk in there now, and I've found that the TSC
calculations seem to be OK. It's the value of delay_at_last_interrupt
that seems to be negative/very large sometimes, and causes the clock
to race. I've tried to trace it out, but I get lost with the 8254
timer stuff.
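As an aside on "negative/very large": the two can be the same bit pattern. The sketch below is only an illustration, not the code in time.c. It assumes a down-counting timer that is reloaded with a latch value, so the elapsed ticks are (latch - 1) - count; the function names and the latch value 11932 are made up for the example. If the count read back is ever larger than expected, the unsigned subtraction wraps, and the result prints as a small negative number when signed or a huge one when unsigned.

```c
#include <stdio.h>

/*
 * Hypothetical helper: ticks elapsed since a down-counting timer was
 * reloaded with `latch`. If `count` is larger than latch - 1, the
 * unsigned subtraction wraps around.
 */
static unsigned long elapsed_ticks(unsigned long latch, unsigned long count)
{
    return (latch - 1) - count;
}

/* Print the same value both ways to show the two interpretations. */
static void show_elapsed(unsigned long latch, unsigned long count)
{
    unsigned long e = elapsed_ticks(latch, count);

    printf("count=%lu unsigned=%lu signed=%ld\n", count, e, (long)e);
}
```

For example, show_elapsed(11932, 11000) gives a sane 931 either way, while show_elapsed(11932, 11940) gives -9 when read as signed and an enormous value when read as unsigned.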
I've never seen the problem on a single proc box (not even on this
machine before I installed the second CPU). I really hesitate to
guess what might cause it, but interrupts would be something I'd look
at carefully. I found I had more problems when I was doing heavy
serial I/O or burning a CD-ROM.
I'd be willing to look at it again if somebody's interested. I can
reproduce the original problem at will with a stock 2.2.10 kernel. I
haven't upgraded from 2.3.3 because (last I looked) somebody had
broken the VFAT and/or NT filesystems, but I could try a more recent
kernel too.
d.