Re: Needed faster implementation of do_gettimeofday()

From: George Anzinger
Date: Tue Feb 22 2005 - 11:47:10 EST


Puneet Kaushik wrote:
Hello Parag and George,

Thanks for the immediate reply.
The main problem is that I am working on an SMP system. I have written a
small program that just calls gettimeofday() one billion times. I ran it
under the time utility, and it takes almost twice as long on SMP as on
UP.
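
For reference, a minimal sketch of the kind of test loop described above. The original program was not posted, so the function name time_gettimeofday_calls() and the way the elapsed time is measured are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Call gettimeofday() n times and return the elapsed wall-clock time in
 * seconds, measured with gettimeofday() itself. Hypothetical helper. */
double time_gettimeofday_calls(unsigned long n)
{
    struct timeval start, now;
    unsigned long i;

    assert(n > 0);
    gettimeofday(&start, NULL);
    now = start;
    for (i = 0; i < n; i++)
        gettimeofday(&now, NULL);

    return (double)(now.tv_sec - start.tv_sec)
         + (double)(now.tv_usec - start.tv_usec) / 1e6;
}
```

Calling this with n set to one billion from a small main() and running the binary under time(1) should approximate the test described here.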



With kernel 2.6.10 on UP

real 4m5.495s
user 1m17.088s
sys 2m48.046s


With kernel 2.6.10 on SMP

real 6m24.485s
user 1m43.723s
sys 4m30.749s


And the fact is that this SMP machine is faster and has more memory than
the UP one. As far as I understand, on SMP systems every call takes a
spinlock, synchronizes both processors, and then unlocks them. That's
all I know about it.

On 2.6 the lock is an r/w sequence lock. The CPUs are not synchronized or locked against each other, but some of the instructions in the sequence lock code are "locked" instructions. I find it hard to believe that this would double the time, however.
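
The read side of a sequence lock can be sketched in user space roughly as follows. This is a minimal illustration using C11 atomics, not the kernel's seqlock code; the struct and function names are hypothetical, and the default sequentially consistent ordering is stronger than a tuned implementation would need.

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative user-space analogue of a sequence lock protecting a time
 * value. Not the kernel's seqlock_t; all names are hypothetical. */
struct seq_time {
    atomic_uint seq;   /* even: stable, odd: write in progress */
    atomic_long sec;
    atomic_long usec;
};

/* Single writer: make the count odd, update, then make it even again. */
void seq_time_write(struct seq_time *t, long sec, long usec)
{
    assert(usec >= 0 && usec < 1000000);
    atomic_fetch_add(&t->seq, 1);
    atomic_store(&t->sec, sec);
    atomic_store(&t->usec, usec);
    atomic_fetch_add(&t->seq, 1);
}

/* Reader: takes no lock; retries if a write overlapped the read. */
void seq_time_read(struct seq_time *t, long *sec, long *usec)
{
    unsigned int begin;

    do {
        begin = atomic_load(&t->seq);
        *sec = atomic_load(&t->sec);
        *usec = atomic_load(&t->usec);
    } while ((begin & 1u) || begin != atomic_load(&t->seq));
}
```

Readers never block each other or the writer; the cost on the read side is the retry loop plus whatever ordering instructions the atomics compile to on the target CPU.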

Ah..., now I remember. On SMP x86 boxen, the accounting/run_timer interrupt comes from the lapic timer. This fires once per tick (every 1/HZ) on each CPU, which means there is an additional timekeeping interrupt on top of the PIT one. Over the whole box you get N+1 interrupts per tick, where N is the number of CPUs. Assuming that the PIT and the lapic interrupts take about the same amount of time and that the PIT interrupt is evenly distributed over the CPUs, the per-CPU interrupt load on a two-CPU box should go from 1 to 1.5. This alone would take your roughly 4.09 min UP time to about 6.14 min on an SMP box (that is amazingly close to what you are seeing, if you ask me).

Again, I recommend my HRT patch. There the accounting interrupt is generated by an "all-but-self" IPI. The IPI is sent by the PIT interrupt code, which also does the accounting on the CPU handling the PIT interrupt. Result: N timekeeping interrupts per tick in total, where N is the number of CPUs.
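
The interrupt arithmetic from the two paragraphs above can be written out as a small sketch. The function names are made up for illustration, and the runtime estimate simply assumes, as the argument above does, that run time scales with the per-CPU timekeeping interrupt load.

```c
#include <assert.h>

/* Timekeeping interrupts per tick over the whole box on stock x86 SMP:
 * one lapic timer interrupt per CPU plus one PIT interrupt. */
unsigned int stock_smp_irqs_per_tick(unsigned int ncpus)
{
    assert(ncpus > 0);
    return ncpus + 1;
}

/* With the scheme described for the HRT patch: the PIT interrupt on one
 * CPU plus an all-but-self IPI to the remaining ncpus - 1 CPUs. */
unsigned int hrt_irqs_per_tick(unsigned int ncpus)
{
    assert(ncpus > 0);
    return 1 + (ncpus - 1);
}

/* Average interrupts per tick seen by one CPU, assuming the interrupts
 * are spread evenly over the CPUs. */
double per_cpu_load(unsigned int irqs_per_tick, unsigned int ncpus)
{
    assert(ncpus > 0);
    return (double)irqs_per_tick / (double)ncpus;
}

/* Rough estimate only: scale the UP run time by the per-CPU load. */
double estimate_smp_runtime(double up_seconds, double load)
{
    return up_seconds * load;
}
```

For two CPUs this gives a per-CPU load of 1.5 on the stock kernel and 1.0 with the IPI scheme. Scaling the reported UP real time of 245.495 s (4m5.495s) by 1.5 gives about 368 s, against the 384.485 s (6m24.485s) measured on SMP.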



George, I am working on your suggestion now; let me know if it will work
on SMP.

See above. Should solve your problem.

If there is some good implementation for SMP, please let me know.

Thanks,

- Puneet




On Tue, 2005-02-22 at 08:36, George Anzinger wrote:

Parag Warudkar wrote:

On Sunday 20 February 2005 05:58 am, puneet_kaushik@xxxxxxxxxxxxxxxx wrote:


985913 8.6083 vmlinux mark_offset_tsc
584473 5.1032 libc-2.3.2.so getc


What makes you think mark_offset_tsc is slow? Do you have any comparative numbers? It might just be that the workload you are throwing at it justifies it. (E.g., if your workload does a zillion system calls, system_call will show up as a hot spot in oprofile. That doesn't necessarily mean it is slow; it's just overused.) Can you post the relevant code?

He really is right. Mark offset is reading the PIT counter and that is not only rather dumb but dog slow.

A suggestion: try the high-res-timers patch. Even if you don't use the timers, the mark offset there is MUCH faster. It does not read the PIT.

The difference is where in time we assume the jiffy bump happens. If we assume it is at the point where the PIT interrupts, well then the only way to get to that is to read the PIT. If, on the other hand, we assume it is at the time after the interrupt where we mark offset, we can observe the "best" time for this event based on the TSC and avoid reading the PIT.
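
A rough sketch of the TSC-only idea, under stated assumptions: the struct, its fields, and the function name are hypothetical, the TSC rate is taken as constant, and the current TSC value is passed in as a parameter instead of being read with rdtsc so that the example stays portable. It is not the code from the HRT patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot taken when the timer interrupt marks the offset. */
struct tsc_mark {
    uint64_t tsc_at_mark;     /* TSC value recorded at the jiffy bump */
    uint64_t cycles_per_usec; /* assumed constant TSC rate */
};

/* Microseconds elapsed since the last mark, using TSC values only;
 * no PIT read is involved. */
uint64_t usec_since_mark(const struct tsc_mark *m, uint64_t tsc_now)
{
    assert(m->cycles_per_usec > 0);
    /* Unsigned subtraction stays correct across counter wrap-around. */
    return (tsc_now - m->tsc_at_mark) / m->cycles_per_usec;
}
```

The time of day is then the time recorded at the mark plus this offset, so the cost per call is one cheap counter read rather than slow I/O port accesses to the PIT.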

Try the HRT patch (see signature below) and see if it doesn't do better.


--
George Anzinger george@xxxxxxxxxx
High-res-timers: http://sourceforge.net/projects/high-res-timers/
