Re: What I suspect : [PATCH] sysdat driver for faster gettimeofday()

Andrea Arcangeli (andrea@suse.de)
Fri, 10 Dec 1999 16:05:21 +0100 (CET)


On Thu, 9 Dec 1999, John Wright wrote:

>How exactly are you figuring the resolution?

See do_gettimeofday() in arch/i386/kernel/time.c:

[..]
void do_gettimeofday(struct timeval *tv)
{
        extern volatile unsigned long lost_ticks;
        unsigned long flags;
        unsigned long usec, sec;

        read_lock_irqsave(&xtime_lock, flags);
        usec = do_gettimeoffset();
        {
                unsigned long lost = lost_ticks;
                if (lost)
                        usec += lost * (1000000 / HZ);
        }
        sec = xtime.tv_sec;
        usec += xtime.tv_usec;
        read_unlock_irqrestore(&xtime_lock, flags);
[..]

You are losing do_gettimeoffset(). The internals of do_gettimeoffset() could
give us picosecond precision in the retval. Of course the retval is in
usec, but this means that every bit in the tv_usec is significant in the
current do_gettimeoffset. I can't remember the precision of gettimeoffset
on 486 and 386, but it's far more precise than 10msec, as we don't set the
latch-downcount of the timer chip to 1.
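(For reference, the 8254 timer input clock is about 1.19 MHz, so reading
back the downcounter already gives roughly 0.8 usec resolution even
without rdtsc.)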

You don't need a Pentium compile to use the rdtsc-based gettimeoffset,
because we probe the CPU capabilities at boot; achieving such precision in
gettimeofday is a very good thing IMHO.

There are two other issues that it seems to me you have not addressed.

The usec and sec fields are not updated atomically, so if you don't change
the kernel to update them atomically and userspace to read them atomically,
you can read weird values from your mapped page. I think on IA32 there's a
kind of cmpxchg8 (or something like that :) that can do that, so this looks
fixable for recent IA32 CPUs.
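For example, one rough sketch of a consistent read (this is a
sequence-counter scheme, not the cmpxchg8 approach; the page layout and
names are invented just for illustration, and memory barriers are omitted):

#include <sys/time.h>

/* hypothetical layout of the mapped page; the kernel would increment
 * seq before and after every update, so an odd value means an update
 * is in progress */
struct time_page {
        volatile unsigned long seq;
        volatile long tv_sec;
        volatile long tv_usec;
};

static void read_time_page(const volatile struct time_page *p,
                           struct timeval *tv)
{
        unsigned long seq;

        do {
                seq = p->seq;
                tv->tv_sec  = p->tv_sec;
                tv->tv_usec = p->tv_usec;
        } while ((seq & 1) || seq != p->seq);
}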

The other issue is that you are not accounting for lost_ticks, so you may
read an obsolete xtime if the timer_bh has been masked for some time.

>I understand that the initial mmap is higher latency but that is
>just on the initial call... Every one after that is just a read from
>memory.

In fact I was saying that only about the initial call and nothing more, of
course. After you have mapped the page you won't enter the kernel anymore.
But if you also avoid the mmap you get lower startup latency ;). For this
reason I prefer a magic page shared by all tasks to an mmap.

The _only_ way to implement gettimeofday in userspace is to export in a
special kernel page readable from userspace three things: the xtime and
the corresponding number of CPU cycles at a certain time, plus a constant
to use as the conversion between CPU cycles and a unit of time. Then with
rdtsc in userspace we can calculate how much time has passed since the
reference rdtsc and add that value to the reference xtime we have in the
page. Everything else looks race prone or is unacceptable due to a too low
precision of 10msec. All three values will have to remain static all
the time.
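To make that concrete, here is a rough sketch of the userspace side,
assuming a hypothetical page layout and invented field names (i386 only,
and the atomic-read issue above still applies to the exported values):

#include <sys/time.h>

/* hypothetical exported page: reference xtime, the rdtsc value taken at
 * that same instant, and the cycles-per-usec conversion constant */
struct time_ref_page {
        volatile unsigned long ref_tv_sec;
        volatile unsigned long ref_tv_usec;
        volatile unsigned long long ref_tsc;
        volatile unsigned long cycles_per_usec;
};

static inline unsigned long long rdtsc(void)
{
        unsigned long long t;
        __asm__ __volatile__("rdtsc" : "=A" (t));       /* edx:eax on i386 */
        return t;
}

static void user_gettimeofday(const volatile struct time_ref_page *p,
                              struct timeval *tv)
{
        unsigned long long delta = rdtsc() - p->ref_tsc;
        unsigned long long usec = p->ref_tv_usec + delta / p->cycles_per_usec;

        tv->tv_sec  = p->ref_tv_sec + usec / 1000000;
        tv->tv_usec = usec % 1000000;
}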

But even this way that I am pointing out now has the downside that if the
constant that does the conversion between CPU cycles and a unit of time
is not _perfect_, we can lose precision after, say, 10 years of usage. The
larger the rdtsc value becomes, the less precise gettimeofday becomes.
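To give a rough idea (an illustrative number, not a measurement): if the
conversion constant is off by just one part per million, the accumulated
error after 10 years is about 315 seconds, since 10 years is roughly
3.15*10^8 seconds.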

Currently the precision of the conversion constant is not an issue because
the difference between the current rdtsc and the reference rdtsc is almost
always below 10msec, and so lots of least significant bits get discarded
in the conversion from CPU cycles to usec.
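(With the same illustrative 1 ppm error, a 10msec window contributes only
about 10 nanoseconds of error, well below the usec granularity of the
interface.)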

Andrea
