Re: [PATCH 2/3] watchdog: control hard lockup detection default

From: Ulrich Obergfell
Date: Thu Jul 24 2014 - 07:18:12 EST


> ----- Original Message -----
> From: "Paolo Bonzini" <pbonzini@xxxxxxxxxx>
> To: "Andrew Jones" <drjones@xxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, kvm@xxxxxxxxxxxxxxx
> Cc: uobergfe@xxxxxxxxxx, dzickus@xxxxxxxxxx, akpm@xxxxxxxxxxxxxxxxxxxx, mingo@xxxxxxxxxx
> Sent: Thursday, July 24, 2014 12:46:11 PM
> Subject: Re: [PATCH 2/3] watchdog: control hard lockup detection default
>
> On 24/07/2014 12:13, Andrew Jones wrote:
>>
>> The running kernel still has the ability to enable/disable at any
>> time with /proc/sys/kernel/nmi_watchdog as usual. However even
>> when the default has been overridden /proc/sys/kernel/nmi_watchdog
>> will initially show '1'. To truly turn it on one must disable/enable
>> it, i.e.
>> echo 0 > /proc/sys/kernel/nmi_watchdog
>> echo 1 > /proc/sys/kernel/nmi_watchdog
>
> Why is it hard to make this show the right value? :)
>
> Paolo

'echo 1 > /proc/sys/kernel/nmi_watchdog' enables both hard lockup and
soft lockup detection. watchdog_enable_all_cpus() starts a 'watchdog/N'
thread for each CPU. If the kernel runs on a bare metal system where the
processor does not have a PMU, or if perf_event_create_kernel_counter()
returns failure to watchdog_nmi_enable(), or if the kernel runs as a
guest on a hypervisor that does not emulate a PMU, then the 'watchdog/N'
threads are still active for soft lockup detection. Patch 2/3 essentially
makes watchdog_nmi_enable() behave as if -ENOENT had been returned by
perf_event_create_kernel_counter(). This is then reported via a console
message:

NMI watchdog: disabled (cpu0): hardware events not enabled

It's hard to say what _is_ 'the right value' (because lockup detection is
then enabled 'partially'), regardless of whether patch 2/3 is applied
or not.
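
To illustrate the 'partially enabled' state, here is a rough user-space
model of the flow described above. It is only a sketch, not the code in
kernel/watchdog.c: the struct, the model_* function names and the
'hardlockup_off_by_default' flag are made up for this example, and the
choice to print the message for cpu0 only is a simplification.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * User-space model only. The struct and all model_* names are hypothetical
 * and exist just for this illustration; this is not kernel code.
 */
struct cpu_watchdog {
	bool thread_running;    /* 'watchdog/N' thread: soft lockup detection */
	bool perf_event_active; /* NMI perf event: hard lockup detection */
};

/* Stand-in for perf_event_create_kernel_counter(): 0 or a negative errno. */
static int model_create_counter(bool pmu_available)
{
	return pmu_available ? 0 : -ENOENT;
}

/*
 * Stand-in for watchdog_nmi_enable(). With 'hardlockup_off_by_default' set
 * (assumed here to represent the patch 2/3 case), behave as if the counter
 * creation had failed with -ENOENT.
 */
static int model_nmi_enable(struct cpu_watchdog *wd, int cpu,
			    bool pmu_available, bool hardlockup_off_by_default)
{
	int err = hardlockup_off_by_default ? -ENOENT
					    : model_create_counter(pmu_available);

	wd->perf_event_active = (err == 0);

	/* Simplification: report the failure once, for cpu0 only. */
	if (err == -ENOENT && cpu == 0)
		printf("NMI watchdog: disabled (cpu%d): hardware events not enabled\n",
		       cpu);
	return err;
}

/* Models one CPU's share of watchdog_enable_all_cpus(). */
static int model_watchdog_enable(struct cpu_watchdog *wd, int cpu,
				 bool pmu_available,
				 bool hardlockup_off_by_default)
{
	/* The per-CPU thread starts regardless of the perf event outcome. */
	wd->thread_running = true;
	return model_nmi_enable(wd, cpu, pmu_available,
				hardlockup_off_by_default);
}
```

In this model, a call with pmu_available == false, or with
hardlockup_off_by_default == true, leaves thread_running set and
perf_event_active clear. A single 0/1 value cannot describe that state,
which is the point made above.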

Regards,

Uli