Re: gettimeofday non-monotonic on SMP 2.3.47

From: Andrea Arcangeli (andrea@suse.de)
Date: Thu Mar 02 2000 - 13:04:26 EST


On Thu, 2 Mar 2000, Boris Okun wrote:

>I don't quite understand the logic of lost_ticks in do_gettimeofday().

When the timer irq runs, it increases lost_ticks so that if the bh are
inibithed we take care of the additional ticks that passed after we
snapshotted the new TSC state during the timer irq.

>Andrea, could you explain it to me? I think this code really belongs to
>do_fast_gettimeoffset() for the following 2 reasons:
>1) When using do_slow_gettimeoffset() it's erroneus.

do_slow_gettimeoffset is never erroneus, but it's slower and less precise.
So on TSC capable hardware you really prefer to not use it but to use the
TSC at get-time time (instead of doing sloww I/O at each get-time call).

>2) You want to take lost_ticks into account in do_settimeofday() (when
>undoing do_gettimeoffset().

We just do that as far I can tell. See do_settimeofday() that calls
do_gettimeoffset().

>To solve my problem I made the following changes (thanks to Artur
>Skawina):
>1) Made tsc_quotient[NR_CPUS] a per processor value

So far the rule is been that on IA32 SMP machines the TSC goes at the same
speed on all CPUs. If that's not longer true not only do_gettimeoffset
will break. If the TSC runs at different speed userspace will break too.
Things like fftw uses the TSC for doing faster runtime benchmarking to
dynamically tune the algorithms.

And anyway that change is a noop for your problem (see below).

>2) Move calls to calibrate_tsc() to the same places where we call
>calibrate_delay

Since you said you have per-cpu tsc_quotient you could as well skip
calibrate_tsc, so this change meaningless w.r.t. gettimeofday.

>2) Made last_tsc[NR_CPUS] a full 64 bit, instead of just last_tsc_low.

That's useless since if the timer irq is not run for more than 10msec it's
a bug and you would lose time anyway if that would happen.

The CPU should run a 429ghz to overflow the low part of the tsc in 10msec
so we still have a few years before we need to use the high part ;). On a
mean 500mhz cpu it takes around 8 seconds to overflow the low tsc a
8seconds is more than enough for IA32 CPUs.

>3) Made delay_at_last_interrupt[NR_CPUS]

That seems wrong. delay_at_last_interrupt is a per-timer-chip thing and
there's _only_ one timer chip not one chip per CPU.

>4) Deleted lost_ticks code from do_gettimeofday. Use instead the full
>difference
>between current_tsc and last_tsc in do_fast_gettimeoffset(), not just
>LSB.

That's wrong. lost_ticks accounts the time that is going to be added to
xtime but that's not been yet added because of bh inibithed at timer irq
time. At irq time you overwritten last_tsc and with only last_tsc you
can't calculate lost_ticks. So you can get wrong result out of
gettimeofday this way (if the timer irq gets delayed a bit more).

>Now I have a perfectly monotone time and my interactive problems are
>gone -- perfectly usable SMP system with different processors (despite
>all naysayers ;-). If there is any interest I can post a patch.

I guess the fact you handle delay_at_last_interrupt[NR_CPUS] wrong, had
the side effect of hiding the probelm.

See my supposion in my other email:

On Mon, 28 Feb 2000, Andrea Arcangeli wrote:
>
>Date: Mon, 28 Feb 2000 02:32:09 +0100 (CET)
>From: Andrea Arcangeli <andrea@suse.de>
>To: Boris Okun <bokun@home.com>
>Cc: "linux-kernel@vger.rutgers.edu" <linux-kernel@vger.rutgers.edu>
>Subject: Re: gettimeofday non-monotonic on SMP 2.3.47
>
>On Sun, 27 Feb 2000, Boris Okun wrote:
>
>>in synchronize_tsc_bp because of 0 divide. My understanding is that this
>>happens since fast_gettimeoffset_quotient is 0 now. So I put
>>fast_gettimeoffset_quotient = calibrate_tsc();
>>
>>before #ifdef 0 in your patch.
>
>Correct.
>
>>If you have any suggestions, please let me know.
>
>Hmm maybe your timer chip is not reporting the time passed since the last
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>timeout correctly.
 ^^^^^^^^^^^^^^^^^
>
>Andrea
>
>

We just verifyed with my early patch that the TSC is not the problem since
you wasn't using it at all and you was still getting erroneous values.

>From the information I got from you I think it's a timer chip issue and
not a TSC problem at all. If that's is true you can try this patch:

--- 2.3.49pre2/arch/i386/kernel/time.c Sun Jan 30 15:43:27 2000
+++ /tmp/time.c Thu Mar 2 18:37:30 2000
@@ -106,7 +106,7 @@
                  "0" (eax));
 
         /* our adjusted time offset in microseconds */
- return delay_at_last_interrupt + edx;
+ return edx;
 }
 
 #define TICK_SIZE tick
@@ -448,6 +448,7 @@
 
                 rdtscl(last_tsc_low);
 
+#if 0
                 outb_p(0x00, 0x43); /* latch the count ASAP */
 
                 count = inb_p(0x40); /* read the latched count */
@@ -455,6 +456,7 @@
 
                 count = ((LATCH-1) - count) * TICK_SIZE;
                 delay_at_last_interrupt = (count + LATCH/2) / LATCH;
+#endif
         }
  
         do_timer_interrupt(irq, NULL, regs);

This patch will remove from the gettimeofday path any calculation where
the timer chip latch is involved. BUT the above patch inserts a bug and
with the above patch applyed gettimeofday can report wrong results if the
timer irq gets delayed for a few msec by some code running clied. Maybe in
your patch you did something like the above and now you think all is
running fine. I believe you are not getting right the
delay_at_last_interrupt[NR_CPUS] (exactly because it make no sense to make
it per-cpu) and may have the side effect of acting just like what the
above patch will do.

BTW, what does happen if you run an UP compiled kernel? If my supposion is
right you'll still get drift also with an UP kernel. Otherwise it probably
means the early test we did was not reliable.

Hmm...

Andrea

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Mar 07 2000 - 21:00:12 EST