On Mon, Apr 30, 2012 at 01:56:16PM -0700, John Stultz wrote:
> On 04/28/2012 01:04 AM, Richard Cochran wrote:
> > It would be fine to change the leap second status on the tick, but
> > then you must also change the time then, and only then, as well. I
> > know Linux moved away from this long ago, and the new way is
> > better, but still what the kernel does today is just plain wrong.
>
> Could you maybe clarify a bit more here how the kernel is plain
> wrong? Try a clear problem description, including examples (before
> and after your changes)? I'm worried we're talking about different
> problems.
>
> Well, the leap-offset is a second, but when it arrives is only
> tick-accurate. :)

I can synchronize over the network to under 100 nanoseconds, so to
me, one second is a large offset.

But there is a fix. I just offered it.
> True, although even if it is a hack, Google *is* using it. My
> concern is that if CLOCK_REALTIME is smeared to avoid a leap second
> jump, in that environment we cannot also accurately provide a
> correct CLOCK_TAI. So far that's not been a problem, because
> CLOCK_TAI isn't a clockid we yet support. But the expectations bar
> always rises, so I suspect once we have a CLOCK_TAI, someone will
> want us to handle smeared leap seconds without affecting
> CLOCK_TAI's correctness.
>
> If both are adopted (separately) by enough folks, we will have to
> support both ways at once. That's why I'm trying to suggest we
> think a bit about how that might be possible.

It is either/or, but not both simultaneously.
My proposal does not prevent the smear method in any way. People who
want the smear just never schedule a leap second. People who want the
frequency constant just use the TAI clock interface for the important
work.
We really don't have to support both ways at once.
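
Concretely, "use the TAI clock interface" would look like this from
user space. A minimal sketch, assuming a kernel and libc that expose
CLOCK_TAI (which, per John's note above, is not yet a supported
clockid):

#include <stdio.h>
#include <time.h>

int main(void)
{
	struct timespec utc, tai;

	/* CLOCK_REALTIME may be smeared or stepped around a leap second. */
	clock_gettime(CLOCK_REALTIME, &utc);

	/* CLOCK_TAI would tick at a constant rate straight through it. */
	clock_gettime(CLOCK_TAI, &tai);

	printf("UTC %lld.%09ld  TAI %lld.%09ld\n",
	       (long long) utc.tv_sec, utc.tv_nsec,
	       (long long) tai.tv_sec, tai.tv_nsec);
	return 0;
}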
> > It makes your kernel image larger with no added benefit.
>
> Again, this should be justified with numbers (try size vmlinux or
> size ntp.o to generate these). More config options make code harder
> to maintain & test, so I'm pushing back a bit here. I also suspect
> keeping both can be done with very little extra code.

*Any* extra work is a big deal to folks who are sensitive to
clock_gettime performance.

> That said, I don't see why it's more complicated to also handle
> leap removal?
>
> 3. Use adjtimex() and interpret the timespec and time_state return
> together to interpret 23:59:60 properly?
>
> For users of clock_gettime/gettimeofday, a leap second is an
> inconsistency. Neither interface provides a way to detect that the
> TIME_OOP flag is set and that it's not 23:59:59 again but 23:59:60
> (which can't be represented by a time_t). Thus even if the behavior
> were perfect, and the leap second landed at exactly the second
> edge, it is still a time hiccup to most applications anyway.
>
> Thus, most of userland doesn't really care if the hiccup happens up
> to a tick after the second's edge. They don't expect it anyway. So
> they really don't want a constant performance drop in order for the
> hiccup to be more "correct" when it happens. :)
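
For reference, option 3 is mechanically possible, because adjtimex()
returns the clock state that clock_gettime()/gettimeofday() hide. A
minimal sketch (illustrative only, not part of the patches under
discussion):

#include <stdio.h>
#include <sys/timex.h>

int main(void)
{
	struct timex tx = { 0 };	/* modes == 0: read-only query */
	int state = adjtimex(&tx);

	if (state == TIME_OOP) {
		/* Leap second in progress: the time_t value of 23:59:59
		 * is being repeated, so this second is really 23:59:60. */
		printf("%ld is really 23:59:60 UTC\n",
		       (long) tx.time.tv_sec);
	} else {
		printf("state %d, time_t %ld\n", state,
		       (long) tx.time.tv_sec);
	}
	return 0;
}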
I don't buy that argument. Repeating a time_t value leads to
ambiguous UTC times, but it is POSIXly correct. The values are usable
together with difftime(3). Having the time_t go forward and then back
again is certainly worse.
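
As a worked example of that POSIX behavior, take the leap second
scheduled for the end of June 2012:

#include <stdio.h>
#include <time.h>

int main(void)
{
	/* Around 2012-06-30T23:59:60Z, the time_t value 1341100799
	 * labels both 23:59:59 and the inserted 23:59:60, and
	 * 1341100800 is then 00:00:00. The label is ambiguous, but
	 * arithmetic on the values stays well defined. */
	time_t before = 1341100798;	/* 23:59:58 UTC */
	time_t after = 1341100800;	/* 00:00:00 UTC */

	/* Prints 2, although three SI seconds actually elapse: POSIX
	 * time_t simply does not count the leap second. */
	printf("%.0f\n", difftime(after, before));
	return 0;
}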
If we leave everything as is, then the user is left with two choices
for data collection applications.
1. Turn off your data system on the night of a leap second.
2. Record data even during a leap second, but post process the files
to fix up all the uglies.
Either way, the kernel has failed us.
> That's why I'm suggesting that you consider starting by modifying
> the adjtimex() interface. Any application that actually cares about
> leap seconds should be using adjtimex(), since it's the only
> interface that allows you to realize that's what's happening. It's
> not a performance-optimized path, and so it's a fine candidate for
> being slow-but-correct.
>
> My only concern there is that it would cause problems when mixing
> adjtimex() calls with clock_gettime() calls, because you could have
> a tick-length of time when they report different time values. But
> this may be acceptable.
>
> You did just suggest we allow for CLOCK_TAI to be broken for folks
> who want to use smeared leap seconds. That sounds like an eventual
> bug too. :) Regardless, the point of my suggestion is that you're
> likely going to be adding logic to a very hot path, and the
> resulting benefit has not been clearly stated. Expect push back
> there (not just from me), so have a strong argument and show
> overhead impact with numbers to help your case. Alternatively, look
> for other solutions that don't affect the hot path (like what I
> suggested w/ adjtimex() above - although that may have downsides
> too).

(Introduce yet another kernel bug? No, thanks ;)
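
To spell out the quip: with a leap-aware, slow-but-correct adjtimex()
(hypothetical here) and an unchanged clock_gettime(), the two
interfaces could disagree by a whole second for up to one tick. A
rough probe of that window, purely illustrative since the two calls
are not atomic:

#include <stdio.h>
#include <sys/timex.h>
#include <time.h>

int main(void)
{
	struct timex tx = { 0 };	/* read-only query */
	struct timespec ts;

	adjtimex(&tx);
	clock_gettime(CLOCK_REALTIME, &ts);

	/* Assumes the hypothetical leap-aware adjtimex(); with today's
	 * kernel the two results differ only when a second boundary
	 * happens to fall between the calls. */
	if (tx.time.tv_sec != ts.tv_sec)
		printf("adjtimex says %ld, clock_gettime says %ld\n",
		       (long) tx.time.tv_sec, (long) ts.tv_sec);
	return 0;
}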