Re: [PATCH] watchdog: nohz: don't run watchdog on nohz_full cores

From: Don Zickus
Date: Thu Apr 02 2015 - 09:36:18 EST

On Tue, Mar 31, 2015 at 02:30:44PM -0400, Chris Metcalf wrote:
> On 03/31/2015 03:25 AM, Ingo Molnar wrote:
> >* cmetcalf@xxxxxxxxxx <cmetcalf@xxxxxxxxxx> wrote:
> >
> >>From: Chris Metcalf <cmetcalf@xxxxxxxxxx>
> >>
> >>Running watchdog can be a helpful debugging feature on regular
> >>cores, but it's incompatible with nohz_full, since it forces
> >>regular scheduling events. Accordingly, just exit out immediately
> >>from any nohz_full core.
> >>
> >>An alternate approach would be to add a flags field or function to
> >>smp_hotplug_thread to control on which cores the percpu threads
> >>are created, but it wasn't clear that much mechanism was useful.
> >>
> >>[...]
> >So what happens if someone wants to enable the lockup detector, with a
> >long timeout, even on nohz-full CPUs? This patch makes that
> >impossible.
> >
> >A better solution would be to tweak the defaults:
> >
> >  - to default the watchdog(s) to disabled when nohz-full is
> >    enabled, even if HARDLOCKUP_DETECTOR=y or DETECT_HUNG_TASK=y, and
> >    allow it to be re-enabled via its sysctl.
>
> That's certainly a reasonable thing to do; it looks like just an #ifdef
> at the top of watchdog.c would suffice. Does this look right?
>
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 8a46d9d8a66f..c8555c211e65 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -25,7 +25,11 @@
>  #include <linux/kvm_para.h>
>  #include <linux/perf_event.h>
>
> +#ifdef CONFIG_NO_HZ_FULL
> +int watchdog_user_enabled = 0;
> +#else
>  int watchdog_user_enabled = 1;
> +#endif
>  int __read_mostly watchdog_thresh = 10;
>  #ifdef CONFIG_SMP
>  int __read_mostly sysctl_softlockup_all_cpu_backtrace;
>
> It doesn't look like I need to do anything else special to disable
> HARDLOCKUP_DETECTOR, and khungtaskd can happily run on
> a non-nohz core, so that should be OK.
>
> What I was trying to achieve with my proposed patch was kind
> of orthogonal: to allow the watchdog to run on standard cores,
> but not run on nohz cores, so we could benefit from it on the
> cores where it was safe for it to run. Do you see value in this,
> or better to just enable/disable all watchdog threads collectively?
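
For reference, the per-core behaviour described in the quoted paragraph
above can be sketched in plain userspace C. This is only an illustration
under stated assumptions, not the proposed patch: cpu_mask_t,
cpu_is_nohz_full(), watchdog_should_run() and count_watchdog_cpus() are
hypothetical names, and a 64-bit mask stands in for the kernel's cpumask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel cpumask: bit N set means CPU N is nohz_full. */
typedef uint64_t cpu_mask_t;

/* Return true if 'cpu' is marked nohz_full in the (assumed) mask. */
static bool cpu_is_nohz_full(cpu_mask_t nohz_full_mask, unsigned int cpu)
{
	return cpu < 64 && ((nohz_full_mask >> cpu) & 1u);
}

/*
 * Per-core policy from the quoted text: the watchdog runs on a CPU only
 * if it is enabled globally and that CPU is not a nohz_full core.
 */
static bool watchdog_should_run(bool user_enabled, cpu_mask_t nohz_full_mask,
				unsigned int cpu)
{
	return user_enabled && !cpu_is_nohz_full(nohz_full_mask, cpu);
}

/* Count how many of the first 'nr_cpus' CPUs would still run a watchdog. */
static unsigned int count_watchdog_cpus(bool user_enabled,
					cpu_mask_t nohz_full_mask,
					unsigned int nr_cpus)
{
	unsigned int cpu, count = 0;

	assert(nr_cpus <= 64);	/* the sketch only models up to 64 CPUs */
	for (cpu = 0; cpu < nr_cpus; cpu++)
		if (watchdog_should_run(user_enabled, nohz_full_mask, cpu))
			count++;
	return count;
}
```

With CPUs 1-3 marked nohz_full (mask 0xe) on a 4-CPU system, only CPU 0
would keep its watchdog under this policy, while a global disable turns
it off everywhere.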

Hmm, I am not sure I am a big fan of this approach. I know RHEL keeps the
watchdogs enabled for customers and it would be a regression if we disabled
them. At the same time, I could see RHEL leaning towards enabling
CONFIG_NO_HZ_FULL, which would just delay this problem a number of years
until RHEL-8 gets around to ramping up.

So I guess I would prefer to figure out a better co-existing solution now.

Can I ask how the NO_HZ_FULL technology works from userspace? Is there a
system command that has to be sent? How does the kernel know to turn off
ticks and trust userspace to do the right thing?


> --
> Chris Metcalf, EZChip Semiconductor