Just in time for this year's leap second, this patch series presents a
solution for the UTC leap second mess.

The trivial cleanups I went ahead and took, but I think the rest still
needs some work.

Of course, the POSIX UTC system is broken by design, and the Linux
kernel cannot fix that. However, what we can do is correctly execute
leap seconds and always report the time variables (UTC time, TAI
offset, and leap second status) with consistency.
The basic idea is to keep the internal time using a continuous
timescale and to convert to UTC by testing the time value against the
current threshold and adding the appropriate offset. Since the UTC
time and the leap second status are provided on demand, this eliminates
the need to set a timer or to constantly monitor for leap seconds, as
was done up until now.
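A minimal sketch of that on-demand conversion, in C, might look like
the following. The names (leap_state, utc_reading, continuous_to_utc,
the status enum) are hypothetical and are not taken from the patches.
It assumes a single pending insertion, and that the inserted second is
reported by repeating the preceding UTC second.

```c
#include <stdint.h>

/* Hypothetical status values; not the kernel's TIME_* constants. */
enum leap_status { LEAP_NONE, LEAP_PENDING, LEAP_IN_PROGRESS, LEAP_DONE };

/* Hypothetical leap second bookkeeping, set up once per leap event. */
struct leap_state {
	int64_t threshold;  /* continuous-time second of the inserted leap second */
	int32_t tai_offset; /* continuous time minus UTC before the leap, in seconds */
	int     armed;      /* nonzero if an insertion is scheduled */
};

/* Everything a reader wants, derived from one snapshot of the clock. */
struct utc_reading {
	int64_t          utc_sec;
	int32_t          tai_offset;
	enum leap_status status;
};

/*
 * Convert a continuous-time second to UTC on demand: one comparison
 * against the threshold and one offset adjustment, no timer and no polling.
 */
static struct utc_reading continuous_to_utc(const struct leap_state *ls,
					    int64_t now)
{
	struct utc_reading r;

	r.tai_offset = ls->tai_offset;
	r.status = ls->armed ? LEAP_PENDING : LEAP_NONE;

	if (ls->armed && now >= ls->threshold) {
		/* At or past the leap: the offset grows by one second. */
		r.tai_offset += 1;
		r.status = (now == ls->threshold) ? LEAP_IN_PROGRESS : LEAP_DONE;
	}

	r.utc_sec = now - r.tai_offset;
	return r;
}
```

Because the UTC second, the offset, and the status all come from the
same comparison, a reader can never see one of them updated without the
others.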
Patches 2 and 3 are just trivial stuff I saw along the way.
* Benefits
- Fixes the buggy, inconsistent time reporting surrounding a leap
second event.
- Opens the possibility of offering a rational time source to user
space. [ Trivial to offer clock_gettime(CLOCK_TAI) for example. ]

Just to clarify this, so we've got the right scope on the problem,
you're trying to address the fact that the leap second is not actually
applied until the tick after the leap second, correct?
* Performance Impacts
** con
- Small extra cost when reading the time (one integer addition plus
one integer test).

This may not be so small when it comes to folks who are very concerned
about the clock_gettime hotpath.
** pro
- Removes repetitive, periodic division (secs % 86400 == 0) the whole
day long preceding a leap second.
- Cost of maintaining leap second status goes to the user of the
NTP adjtimex() interface, if any.

Not sure I follow this last point. How are we pushing this maintenance
to adjtimex() users?
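For comparison, the periodic check named in the first "pro" item can
be sketched roughly as below. This is a simplified illustration and
not the kernel's actual code; the state names and poll_leap_second()
are hypothetical. The point is only that the modulo runs on every
second while an insertion is pending, and that the leap is applied
when the poll happens to run, not when the time is read.

```c
#include <stdint.h>

/* Hypothetical states, loosely modelled on an NTP-style leap state machine. */
enum poll_state { POLL_OK, POLL_INS, POLL_OOP, POLL_WAIT };

/*
 * Called once per second with the UTC second that is about to begin.
 * Returns the step to apply to the clock: -1 repeats the previous
 * second (leap second inserted), 0 means no change.
 */
static int poll_leap_second(enum poll_state *state, int64_t secs)
{
	switch (*state) {
	case POLL_INS:
		/* This division is evaluated every second while armed. */
		if (secs % 86400 == 0) {
			*state = POLL_OOP;
			return -1;
		}
		break;
	case POLL_OOP:
		/* The leap second has elapsed; wait for the flag to clear. */
		*state = POLL_WAIT;
		break;
	default:
		break;
	}
	return 0;
}
```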
* Todo
- The function __current_kernel_time accesses the time variables
without taking the lock. I can't figure that out.

There are a few cases where we want the current second value when we
already hold the xtime_lock, or we might possibly hold the xtime_lock.
It's a special internal interface for special users (update_vsyscall,
for example).