From: Jeremy Fitzhardinge [mailto:jeremy.fitzhardinge@xxxxxxxxxx]
> With this in place, I can do a gettimeofday in about 100ns on a 2.4GHz
> Q6600. I'm sure this could be tuned a bit more, but it is
> already much better than a syscall.

To evaluate the goodness of this, we really need a full
set of measurements for:
a) cost of rdtsc (and rdtscp if different)
b) cost of vsyscall+pvclock
c) cost of rdtsc emulated
d) cost of a hypercall that returns "hypervisor system time"
On an E6850 (3GHz, but let's use cycles), I measured:
a == 72 cycles
c == 1080 cycles
d == 780 cycles
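
For reference, a figure like (a) can be obtained by timing a run of back-to-back rdtsc reads. What follows is only a minimal sketch, assuming an x86 machine and a GCC- or Clang-compatible compiler that provides `__rdtsc()` in `<x86intrin.h>`. The name `rdtsc_cost_cycles` is made up for illustration, and this is not the harness used for the numbers above. The result is in TSC ticks, which only approximate core cycles, and it includes a small amount of loop and store overhead.

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc(), GCC/Clang on x86 (assumption) */

/*
 * Hypothetical helper: average cost, in TSC ticks, of one rdtsc,
 * measured over `iters` back-to-back reads.
 */
static uint64_t rdtsc_cost_cycles(unsigned int iters)
{
    volatile uint64_t sink;   /* keeps the reads from being optimized away */
    uint64_t start, end;
    unsigned int i;

    if (iters == 0)
        return 0;

    start = __rdtsc();
    for (i = 0; i < iters; i++)
        sink = __rdtsc();
    end = __rdtsc();
    (void)sink;

    return (end - start) / iters;
}
```

Calling `rdtsc_cost_cycles(1000000)` natively and then again in a guest where rdtsc is emulated would give comparable numbers for (a) and (c).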
It may be partly apples and oranges, but it looks
like a good guess for b on my machine is
b == 240 cycles
Not bad, but is there any additional context switch
cost to support it?
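
The guess for (b) is just the quoted 100ns converted to cycles at 2.4GHz. A small sketch of that conversion, using integer math and a clock rate given in MHz; the helper names `ns_to_cycles` and `cycles_to_ns` are illustrative only:

```c
#include <stdint.h>

/* Convert a duration in nanoseconds to cycles at a clock rate in MHz. */
static uint64_t ns_to_cycles(uint64_t ns, uint64_t mhz)
{
    return ns * mhz / 1000;
}

/* Convert a cycle count to nanoseconds at a clock rate in MHz. */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t mhz)
{
    return cycles * 1000 / mhz;
}
```

`ns_to_cycles(100, 2400)` gives 240, matching the estimate for (b). Going the other way at 3GHz, `cycles_to_ns(72, 3000)` is 24ns, `cycles_to_ns(1080, 3000)` is 360ns and `cycles_to_ns(780, 3000)` is 260ns for (a), (c) and (d) respectively.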

From: Avi Kivity [mailto:avi@xxxxxxxxxx]
> Instead of using vgetcpu() and rdtsc() independently, you can
> use rdtscp to read both atomically. This removes the need for the
> preempt notifier.

Xen does not currently expose rdtscp and so does not emulate
(or context switch) TSC_AUX. Context switching TSC_AUX
is certainly possible, but will likely be expensive.
If the primary reason for vsyscall+pvclock is to maximize
performance for gettimeofday/clock_gettime, this cost
would need to be added to the mix.
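
To make the rdtscp suggestion concrete, here is a rough sketch of reading the TSC and TSC_AUX in one instruction, with a CPUID check for processors that lack rdtscp. It assumes GCC or Clang on x86 (`__rdtscp()` from `<x86intrin.h>`, `__get_cpuid()` from `<cpuid.h>`), and it assumes the Linux convention of storing the CPU number in the low 12 bits of TSC_AUX and the node number above that. The struct and function names are hypothetical, and none of this would give meaningful CPU numbers in a guest unless the hypervisor maintains TSC_AUX.

```c
#include <stdint.h>
#include <cpuid.h>       /* __get_cpuid(), GCC/Clang (assumption) */
#include <x86intrin.h>   /* __rdtscp(), GCC/Clang on x86 (assumption) */

/* Hypothetical result type: a TSC value plus the CPU it was read on. */
struct tsc_sample {
    uint64_t tsc;
    unsigned int cpu;
    unsigned int node;
};

/* Returns nonzero if CPUID.80000001H:EDX bit 27 (RDTSCP) is set. */
static int has_rdtscp(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx))
        return 0;
    return (edx >> 27) & 1;
}

/*
 * Read the TSC and TSC_AUX with a single rdtscp, so both values come
 * from the same CPU. The cpu/node split assumes the Linux encoding.
 * Only call this when has_rdtscp() is nonzero.
 */
static struct tsc_sample read_tsc_and_cpu(void)
{
    struct tsc_sample s;
    unsigned int aux;

    s.tsc = __rdtscp(&aux);
    s.cpu = aux & 0xfff;
    s.node = aux >> 12;
    return s;
}
```

A caller would test `has_rdtscp()` once and fall back to some other time source when it returns zero.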

> preempt notifiers are per-thread, not global, and will upset the cycle
> counters. I'd drop them and use rdtscp instead (and give up if the
> processor doesn't support it).

Even if rdtscp is used, in the Intel processor lineup
only the very latest (Nehalem) supports rdtscp, so
"give up" doesn't seem like a very good option, at least
in the near future.