On Wed, Jul 27, 2016 at 08:55:28AM -0500, Christoph Lameter wrote:
On Mon, 25 Jul 2016, Christoph Lameter wrote:
I had similar issues, this seems to happen when the tsc is considered not reliable
Guess so. I will have a look at this when I get some time again.
Ok so the problem is the clocksource_watchdog() function in
kernel/time/clocksource.c. This function is active if
CONFIG_CLOCKSOURCE_WATCHDOG is defined. It will check the timesources of
each processor for being within bounds and then reschedule itself on the
next one.
The purpose of the function seems to be to determine *if* a clocksource is
unstable. It does not mean that the clocksource *is* unstable.
The critical piece of code is this:
	/*
	 * Cycle through CPUs to check if the CPUs stay synchronized
	 * to each other.
	 */
	next_cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask);
	if (next_cpu >= nr_cpu_ids)
		next_cpu = cpumask_first(cpu_online_mask);
	watchdog_timer.expires += WATCHDOG_INTERVAL;
	add_timer_on(&watchdog_timer, next_cpu);
Should we just cycle through the cpus that are not isolated? Otherwise we
need to have some means to check the clocksources for accuracy remotely
(probably impossible for TSC etc).
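To make the "cycle through the cpus that are not isolated" idea concrete, here is a rough userspace sketch of the CPU selection only. It is not kernel code and not a proposed patch: cpu_mask_t, mask_next(), next_watchdog_cpu() and NO_CPU are hypothetical names made up for illustration, and a CPU mask is modelled as a plain 64-bit word instead of a struct cpumask.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only: bit n set means CPU n is in the mask. */
typedef uint64_t cpu_mask_t;

#define NO_CPU (-1)

/* Return the lowest CPU in `mask` strictly above `cpu`, or NO_CPU. */
static int mask_next(int cpu, cpu_mask_t mask)
{
	for (int n = cpu + 1; n < 64; n++)
		if (mask & ((cpu_mask_t)1 << n))
			return n;
	return NO_CPU;
}

/*
 * Pick the CPU for the next watchdog run: the next CPU after `cur`
 * that is online and not isolated, wrapping around to the first
 * such CPU. Returns NO_CPU if every online CPU is isolated.
 */
static int next_watchdog_cpu(int cur, cpu_mask_t online, cpu_mask_t isolated)
{
	cpu_mask_t candidates = online & ~isolated;
	int next;

	assert(cur >= -1 && cur < 64);

	next = mask_next(cur, candidates);
	if (next == NO_CPU)
		next = mask_next(-1, candidates);	/* wrap around */
	return next;
}
```

With CPUs 0-3 online and CPUs 2-3 isolated, the timer would only ever bounce between CPU 0 and CPU 1. The obvious cost is that the isolated CPUs are then never visited by the check.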
The WATCHDOG_INTERVAL is 1 second so this causes an interrupt every
second.
Note that we are running with the patch that removes the 1 HZ minimum time
tick. With an older kernel code base (Red Hat) we can keep the kernel quiet
for minutes. The clocksource watchdog causes timers to fire again.
(which doesn't necessarily mean unstable. I think it has to do with some x86 CPU feature
flag).
IIRC, this _has_ to execute on all online CPUs because the TSC of every running
CPU is concerned.
I personally override that by passing the tsc=reliable kernel parameter. Of course,
use it at your own risk.
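For anyone wanting to confirm that a box was actually booted with that parameter, a small userspace check along these lines may help. This is only a sketch: cmdline_has_param() and booted_with_tsc_reliable() are hypothetical helper names, and the only thing assumed about the system is that /proc/cmdline holds the boot command line as space-separated tokens.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if `cmdline` contains `param` as a whole space-separated token. */
static bool cmdline_has_param(const char *cmdline, const char *param)
{
	size_t len = strlen(param);
	const char *p = cmdline;

	if (len == 0)
		return false;

	while ((p = strstr(p, param)) != NULL) {
		bool starts = (p == cmdline) || p[-1] == ' ';
		bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

		if (starts && ends)
			return true;
		p++;	/* keep searching after this partial match */
	}
	return false;
}

/* Read /proc/cmdline and report whether tsc=reliable was passed at boot. */
static bool booted_with_tsc_reliable(void)
{
	char buf[4096];
	bool found = false;
	FILE *f = fopen("/proc/cmdline", "r");

	if (f == NULL)
		return false;
	if (fgets(buf, sizeof(buf), f) != NULL)
		found = cmdline_has_param(buf, "tsc=reliable");
	fclose(f);
	return found;
}
```

The token match is deliberately strict, so a string such as "notsc=reliable" or "tsc=reliablex" does not count as a hit.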
But in the end I don't think we can offload that to housekeeping-only CPUs.