Re: [RFC PATCH 6/6] timekeeping: Debug missing timekeeping updates

From: John Stultz
Date: Wed Aug 21 2013 - 13:26:07 EST


On 08/21/2013 09:42 AM, Frederic Weisbecker wrote:
> With the full dynticks feature and the tricky full system idle
> detection code that is coming soon, it becomes necessary to have
> some debug code that makes sure that the timekeeping is always
> maintained and moving forward as expected.
>
> This provides a simple detection of missing timekeeping updates
> inspired by the lockup detector's use of CPU cycles clock.
>
> The jiffies are compared to the cpu clock after several snapshots taken
> from NMIs that trigger after arbitrary CPU cycles period overflow.
>
> If the jiffies progression appears to drift too far away from the CPU
> clock's, this triggers a warning.
>
> We just make sure not to account the tiny code on irq entry that
> may have stale jiffies values before tick_check_nohz() is called
> after the CPU is woken up while the system went full idle for some
> time.
>
> Same goes for idle exit in case the tick were stopped but idle
> was polling on need_resched().

So you're using sched_clock to try to detect timekeeping
inconsistencies. Hrm.. Do you have some examples of where this debug
infrastructure helped out?

A few thoughts:

1) Why are you using jiffies as the timekeeping reference instead of
reading some of the actual timekeeping values? Jiffies usage has been
intentionally on the decline, and since the dynticks infrastructure
landed, jiffies are just derived from the timekeeping core, so it's
sort of strange to see them used for this.

2) This seems very similar to the old lost-ticks compensation code we
had prior to the clocksource infrastructure, and seems like it might
suffer from some of the issues seen there. For instance, sched_clock has
historically been looser in its correctness requirements than the
timekeeping code, so using it to validate the stricter timekeeping
code makes me worry we might see false positives.

3) I'm also curious (maybe skeptical): if sched_clock is reliable
enough to use for validating time, then we are likely using that same
hardware as the timekeeping clocksource. Thus the cases where I'd expect
to see issues w/ nohz, like clocksource counter overflows being missed
on quick-wrapping clocksources, wouldn't really apply.
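
For reference, the kind of comparison under discussion can be sketched
in plain userspace C. This is only an illustration of the idea, not the
patch's code: the struct, the function name and the slack parameter are
all hypothetical, and the two clocks are simply passed in as nanosecond
values. The slack is where the false-positive concern from (2) shows
up: if the reference clock is loose, the slack has to be generous.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of two clocks taken at the same instant. */
struct tk_snapshot {
	uint64_t ref_ns;	/* reference clock (sched_clock-like), in ns */
	uint64_t tk_ns;		/* timekeeping value being validated, in ns */
};

/*
 * Return true if timekeeping advanced less than the reference clock
 * by more than slack_ns between the two snapshots. Timekeeping running
 * ahead of the reference is not reported here.
 */
static bool tk_update_missed(const struct tk_snapshot *prev,
			     const struct tk_snapshot *now,
			     uint64_t slack_ns)
{
	uint64_t ref_delta = now->ref_ns - prev->ref_ns;
	uint64_t tk_delta = now->tk_ns - prev->tk_ns;

	return ref_delta > tk_delta && ref_delta - tk_delta > slack_ns;
}
```

With a 10ms slack, a 1ms lag over one second is tolerated, while
timekeeping that did not move at all over that second is flagged.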


Personally, I've been thinking the timekeeping update code could use
some improvements/warnings around cases where the update delay is larger
than the clocksource max_deferment - possibly falling back to a slower
overflow-proof multiply as is done in the CLOCK_SOURCE_SUSPEND_NONSTOP
resume case. This would allow more robust behavior in cases like kvm
guests being paused for unreasonable lengths of time, and could also
provide very similar NOHZ debug warnings (assuming the clocksource
doesn't wrap quickly - but again, in those cases, I'm not confident we
can trust sched_clock either).
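
To make the fallback idea concrete, here is a minimal userspace sketch.
It is not the kernel's implementation: the struct, the function names
and the max_cycles field are hypothetical stand-ins, and the safe path
uses a simple high/low split of the cycle delta rather than whatever
the resume path does. It assumes shift <= 32, so the low part times
mult always fits in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical clocksource parameters; names are illustrative only. */
struct cs_params {
	uint32_t mult;
	uint32_t shift;		/* assumed <= 32 */
	uint64_t max_cycles;	/* largest delta where delta * mult fits in 64 bits */
};

/* Fast path: overflows silently once delta exceeds max_cycles. */
static uint64_t cyc2ns_fast(const struct cs_params *cs, uint64_t delta)
{
	return (delta * cs->mult) >> cs->shift;
}

/*
 * Slower path: split delta into hi * 2^shift + lo, so that
 * (delta * mult) >> shift == hi * mult + ((lo * mult) >> shift)
 * with no 64-bit overflow in the intermediate products, as long as
 * the final result itself fits in 64 bits.
 */
static uint64_t cyc2ns_safe(const struct cs_params *cs, uint64_t delta)
{
	uint64_t hi, lo;

	assert(cs->shift <= 32);
	hi = delta >> cs->shift;
	lo = delta & ((1ULL << cs->shift) - 1);

	return hi * cs->mult + ((lo * cs->mult) >> cs->shift);
}

/*
 * Pick the path based on the delta and report a late update through
 * *late, which is where a debug warning could be hooked in.
 */
static uint64_t cyc2ns_checked(const struct cs_params *cs, uint64_t delta,
			       int *late)
{
	*late = delta > cs->max_cycles;

	return *late ? cyc2ns_safe(cs, delta) : cyc2ns_fast(cs, delta);
}
```

A caller would compute the cycle delta since the last update, pass it
to cyc2ns_checked(), and warn when *late comes back set. The same
check would cover both the paused-guest case and a missed NOHZ update.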


thanks
-john
