Re: [PATCH 19/21] clocksource: Improve comment explaining clocks_calc_max_nsecs()'s 50% safety margin

From: John Stultz
Date: Thu Apr 02 2015 - 13:30:25 EST


On Thu, Apr 2, 2015 at 1:50 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Wed, Apr 01, 2015 at 08:34:39PM -0700, John Stultz wrote:
>> Ingo noted that the description of clocks_calc_max_nsecs()'s
>> 50% safety margin was somewhat circular. So this patch tries
>> to improve the comment to better explain what we mean by the
>> 50% safety margin and why we need it.
>>
>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Cc: Prarit Bhargava <prarit@xxxxxxxxxx>
>> Cc: Richard Cochran <richardcochran@xxxxxxxxx>
>> Signed-off-by: John Stultz <john.stultz@xxxxxxxxxx>
>> ---
>> kernel/time/clocksource.c | 7 +++++--
>> 1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
>> index c3be3c7..15facb1 100644
>> --- a/kernel/time/clocksource.c
>> +++ b/kernel/time/clocksource.c
>> @@ -472,8 +472,11 @@ static u32 clocksource_max_adjustment(struct clocksource *cs)
>> * @max_cyc: maximum cycle value before potential overflow (does not include
>> * any safety margin)
>> *
>> - * NOTE: This function includes a safety margin of 50%, so that bad clock values
>> - * can be detected.
>> + * NOTE: This function includes a safety margin of 50%, in other words, we
>> + * return half the number of nanoseconds the hardware counter can technically
>> + * cover. This is done so that we can potentially detect problems caused by
>> + * delayed timers or bad hardware, which might result in time intervals that
>> + * are larger then what the math used can handle without overflows.
>> */
>> u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask, u64 *max_cyc)
>> {
>
> Should we make a further note that the tk_fast things rely on this
> slack since they're not strongly serialized against this? That is, they
> can end up using an older cycle_last value and therefore end up with a
> larger delta than other code.

Though, even with the tk_fast bits, we expect the update to happen
regularly, its just that for the benefit of lock-free access we are ok
with the possible slight inconsistencies (in the mono clock) that
could happen if we use a slightly stale value mid-update. So I don't
think the tk_fast bits are actually relying on the slack any more then
the normal timekeeping code relies on the slack to handle slight
delays in processing the updates. If we deal with time deltas large
enough to cause overflows, or time intervals larger then the hardware
can represent, we're sunk in either case. This 50% margin just makes
it easier to catch unexpected delays or issues.

thanks
-john
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/