Re: [RFC][PATCH] Introduce CLOCK_REALTIME_COARSE

From: Andy Lutomirski
Date: Sat Aug 01 2009 - 08:30:10 EST


john stultz wrote:
On Sat, 2009-07-18 at 15:30 -0700, Arjan van de Ven wrote:
On Sat, 18 Jul 2009 15:09:38 -0700
john stultz <johnstul@xxxxxxxxxx> wrote:

After talking with some application writers who want very fast, but
not fine-grained timestamps, I decided to try to implement two new
clock_ids for clock_gettime(): CLOCK_REALTIME_COARSE and
CLOCK_MONOTONIC_COARSE, which return the time at the last tick. This
is very fast as we don't have to access any hardware (which can be
very painful if you're using something like the acpi_pm clocksource),
and we can even use the vdso clock_gettime() method to avoid the
syscall. The only trade-off is that you only get low-res, tick-grained
time resolution.
Does this tie us to having a tick? I still have hope that we can get
rid of the tick even when apps are running .... since with CFS we don't
really need the tick for the scheduler anymore for example....

So it does require some sort of periodic interval. But the granularity
is probably flexible, although I'm not sure it would be of much use if
the granularity gets to be lower than 100 Hz.

While being 100% tickless, even when non-idle, would be nice, there will
be some need for timekeeping events to prevent clocksource counters from
wrapping, and to do accurate NTP steering.

Even so, the value we're exporting in this patch is the xtime_cache,
which is updated every tick. This is currently used in file
timestamping, so if we go 100% tickless, we'll have to change the file
timestamping to use the actual CLOCK_REALTIME clock_id, which requires a
possibly slow hardware read and would likely hurt fs performance.

So this patch doesn't tie us to having a tick or periodic event any
more than the fs timestamping does.
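
For reference, using the coarse clock from userspace would look roughly like the sketch below. It assumes the clock ids end up exposed through the normal clock_gettime()/clock_getres() interface as described in the patch; the helper names are made up for illustration. It reads CLOCK_MONOTONIC_COARSE first and CLOCK_MONOTONIC second, so the coarse value (time at the last tick) should never be ahead of the precise one.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
static int64_t ts_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Hypothetical helper: read the tick-granular clock, then the precise one.
 * Returns 0 on success, -1 if either read fails.
 */
static int read_coarse_then_fine(int64_t *coarse_ns, int64_t *fine_ns)
{
    struct timespec coarse, fine;

    if (clock_gettime(CLOCK_MONOTONIC_COARSE, &coarse) != 0)
        return -1;
    if (clock_gettime(CLOCK_MONOTONIC, &fine) != 0)
        return -1;

    *coarse_ns = ts_to_ns(&coarse);
    *fine_ns = ts_to_ns(&fine);
    return 0;
}

/* Hypothetical helper: resolution of the coarse clock in ns, or -1 on error. */
static int64_t coarse_resolution_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_MONOTONIC_COARSE, &res) != 0)
        return -1;
    return ts_to_ns(&res);
}
```

The resolution reported for the coarse clock should be one tick period, which is the granularity being discussed here.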

Hmm. I think that we can have our cake and eat it too, if the machine has a hardware timer that can be turned on and off very cheaply. Just (heh) turn off the tick when an entire tick interval elapses without an access to the cached time.

This is a win if we frequently have two or more consecutive tick intervals without a clock read, it does nothing (except probably a touch of bookkeeping overhead) when someone reads the cached time during each tick interval, and it's a loss (due to excessive reprogramming of the timer) when the cache is read on alternating intervals. Of course, if the cached time is read (several times, anyway) every tick, then having a tick is a good thing because it avoids time source reads.
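
To make the three cases concrete, here is a toy userspace model of that policy. It is not kernel code, and everything in it (the struct, the function name, the cost model) is an assumption for illustration only. Each array entry says whether the cached time was read during that tick interval. The tick starts out running; it is stopped after an interval with no read, and the first read while it is stopped costs one hardware read plus one timer reprogram to restart it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counters for the toy model; all names here are hypothetical. */
struct tick_stats {
    unsigned ticks;      /* tick interrupts that fired */
    unsigned reprograms; /* times the timer was stopped or restarted */
    unsigned hw_reads;   /* slow time source reads while the tick was off */
};

static struct tick_stats simulate_lazy_tick(const bool *read_in_interval,
                                            size_t n)
{
    struct tick_stats s = {0, 0, 0};
    bool tick_on = true;

    for (size_t i = 0; i < n; i++) {
        /* A read while the tick is off hits the hardware and restarts it. */
        if (!tick_on && read_in_interval[i]) {
            s.hw_reads++;
            s.reprograms++;
            tick_on = true;
        }
        if (tick_on) {
            /* The tick fires at the end of this interval. */
            s.ticks++;
            /* Nobody read the cache during the interval: stop the tick. */
            if (!read_in_interval[i]) {
                s.reprograms++;
                tick_on = false;
            }
        }
    }
    return s;
}
```

Under this model, 8 idle intervals cost 1 tick and 1 reprogram, 8 busy intervals cost 8 ticks and nothing else (same as today), and 8 alternating intervals cost 8 ticks plus 7 reprograms and 3 hardware reads, which is the loss case.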

Getting this to work from a vsyscall would be tricky. We could have a userspace-readable flag indicating both what time it is and whether the value is accurate and has already been requested this interval (use some sentinel value for the not-requested case, at the cost of a tiny chance the vsyscall doesn't work) and punt to the kernel if this is the first access in any interval.
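
A rough userspace mock of the sentinel idea, to show the shape of the fast path. This is only a sketch under assumptions: the shared variable stands in for a kernel page mapped into userspace, the "hardware" is a fake counter, and none of the names correspond to real kernel or vdso symbols. As a simplification, the slow path always reads the fake time source.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical sentinel: "nobody has asked for the time this interval". */
#define TIME_NOT_REQUESTED UINT64_MAX

/* Stand-in for the value the kernel would expose read-only to userspace. */
static _Atomic uint64_t shared_cached_ns = TIME_NOT_REQUESTED;

static uint64_t fake_hw_ns = 1000; /* stand-in for the real time source */
static unsigned slow_path_calls;   /* how often we punted to the "kernel" */

/*
 * Stand-in for the syscall: read the time source, publish the value,
 * and (in a real implementation) note that the tick is still wanted.
 */
static uint64_t slow_path_gettime(void)
{
    uint64_t now = fake_hw_ns;

    slow_path_calls++;
    atomic_store_explicit(&shared_cached_ns, now, memory_order_release);
    return now;
}

/* Stand-in for the tick: a new interval starts with no request seen yet. */
static void tick(uint64_t interval_ns)
{
    fake_hw_ns += interval_ns;
    atomic_store_explicit(&shared_cached_ns, TIME_NOT_REQUESTED,
                          memory_order_release);
}

/* Stand-in for the vsyscall: one load, punt only on the sentinel. */
static uint64_t coarse_gettime(void)
{
    uint64_t v = atomic_load_explicit(&shared_cached_ns,
                                      memory_order_acquire);

    /* If the real time ever equals the sentinel we punt needlessly. */
    if (v == TIME_NOT_REQUESTED)
        return slow_path_gettime();
    return v;
}
```

Only the first coarse_gettime() call in each interval reaches slow_path_gettime(); later calls in the same interval are served from the shared value.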


--Andy