Re: [GIT PULL] isolation: 1Hz residual tick offloading v4

From: Luiz Capitulino
Date: Mon Jan 29 2018 - 10:33:27 EST


On Mon, 29 Jan 2018 02:10:26 +0100
Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:

> On Wed, Jan 24, 2018 at 10:46:08AM -0500, Luiz Capitulino wrote:
> > On Fri, 19 Jan 2018 01:02:14 +0100
> > Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
> >
> > > Ingo,
> > >
> > > Please pull the sched/0hz-v2 branch that can be found at:
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> > > sched/0hz-v2
> > >
> > > HEAD: 9b14d5204490f9acd03998a5e406ecadb87cddba
> > >
> > > Changes in v4:
> > >
> > > * Remove the nohz_offload option, just stick with the existing interface,
> > > the change is transparent. Suggested by Luiz.
> > >
> > > * Automatically pin workqueues to housekeepers.
> >
> > I've been testing this series and the tick doesn't go completely away
> > for me: it ticks at around 8 seconds interval.
> >
> > I've debugged this down to the clocksource_watchdog() timer, which is
> > created by clocksource_start_watchdog(). This timer cycles over all online
> > CPUs. I couldn't find a way to disable it. It seems to be always enabled
> > for x86 by CONFIG_CLOCKSOURCE_WATCHDOG since commit 6471b825c4.
> >
> > Since the 1Hz tick offload worked for you, I must be missing a way
> > to disable this timer or the kernel is thinking my CPU has unstable
> > TSC (which it doesn't AFAIK).
>
> It's beyond the scope of this patchset but indeed that's right, I run my
> kernels with tsc=reliable because my CPUs don't have the TSC_RELIABLE flag.
> That's the only way I found to shutdown the tick completely on my test
> machine, otherwise I keep having that clocksource watchdog.
>
> You can try "tsc=reliable" but that's at your own risks and it's hard
> to tell what exactly are those risks depending on your CPU model (and
> perhaps BIOS?).

Cool, passing tsc=reliable worked for me. I finally got the tick to
go completely away. While I agree that fixing that is beyond the scope
of this series, I think we should improve it anyway since it will probably
come up for people trying the new nohz_full=.
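
For anyone who wants to check this from userspace: one rough way to see
whether a CPU still takes timer interrupts is to sample the LOC (local
timer interrupts) row of /proc/interrupts twice and compare. The sketch
below is only an illustration, assuming x86 and the usual /proc/interrupts
layout; the function names are made up here and are not part of the series.

```python
import time


def parse_loc_counts(interrupts_text):
    """Return the per-CPU counts from the 'LOC:' row of /proc/interrupts text.

    The list index is the column position, which matches the CPU number
    only when all CPUs are online (an assumption of this sketch).
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "LOC:":
            counts = []
            for field in fields[1:]:
                # Numeric columns end where the description text starts.
                if not field.isdigit():
                    break
                counts.append(int(field))
            return counts
    return []


def loc_deltas(before, after):
    """Per-CPU difference between two samples of LOC counts."""
    return [b - a for a, b in zip(before, after)]


def sample(interval_s=10.0, path="/proc/interrupts"):
    """Read the LOC row twice, interval_s seconds apart, and return deltas."""
    with open(path) as f:
        before = parse_loc_counts(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_loc_counts(f.read())
    return loc_deltas(before, after)
```

Calling sample() while a busy loop is pinned to an isolated CPU gives one
delta per CPU. A delta that stays at zero over a long enough interval
suggests the tick is really gone on that CPU; a small non-zero delta would
be consistent with a residual timer such as the clocksource watchdog,
although this row alone cannot tell which timer fired.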

If this has any value:

Tested-by: Luiz Capitulino <lcapitulino@xxxxxxxxxx>

> You likely already had that watchdog timer before this patchset but didn't
> notice because the 1Hz was a more frequent annoyance.

That's possible, but I'm not sure the clocksource watchdog timer was
there before commit 6471b825c4.