Re: [PATCH 2/3] watchdog: control hard lockup detection default
From: Ulrich Obergfell
Date: Thu Jul 24 2014 - 07:45:00 EST
>----- Original Message -----
>From: "Paolo Bonzini" <pbonzini@xxxxxxxxxx>
>To: "Ulrich Obergfell" <uobergfe@xxxxxxxxxx>
>Cc: "Andrew Jones" <drjones@xxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, kvm@xxxxxxxxxxxxxxx, dzickus@xxxxxxxxxx, akpm@xxxxxxxxxxxxxxxxxxxx, mingo@xxxxxxxxxx
>Sent: Thursday, July 24, 2014 1:26:40 PM
>Subject: Re: [PATCH 2/3] watchdog: control hard lockup detection default
>On 24/07/2014 13:18, Ulrich Obergfell wrote:
>>>> >> The running kernel still has the ability to enable/disable at any
>>>> >> time with /proc/sys/kernel/nmi_watchdog as usual. However even
>>>> >> when the default has been overridden /proc/sys/kernel/nmi_watchdog
>>>> >> will initially show '1'. To truly turn it on one must disable/enable
>>>> >> it, i.e.
>>>> >> echo 0 > /proc/sys/kernel/nmi_watchdog
>>>> >> echo 1 > /proc/sys/kernel/nmi_watchdog
>>> > Why is it hard to make this show the right value? :)
>>> > Paolo
>> 'echo 1 > /proc/sys/kernel/nmi_watchdog' enables both - hard lockup and
>> soft lockup detection. watchdog_enable_all_cpus() starts a 'watchdog/N'
>> thread for each CPU. If the kernel runs on a bare metal system where the
>> processor does not have a PMU, or when perf_event_create_kernel_counter()
>> returns failure to watchdog_nmi_enable(), or when the kernel runs as a
>> guest on a hypervisor that does not emulate a PMU, then the 'watchdog/N'
>> threads are still active for soft lockup detection. Patch 2/3 essentially
>> makes watchdog_nmi_enable() behave in the same way as if -ENOENT had
>> been returned by perf_event_create_kernel_counter(). This is then
>> reported via a console message.
>> NMI watchdog: disabled (cpu0): hardware events not enabled
>> It's hard to say what _is_ 'the right value' (because lockup detection is
>> then enabled 'partially'), regardless of whether patch 2/3 is applied
>> or not.
> But this means that it is not possible to re-enable softlockup detection
> only. I think that should be the effect of echo 0 + echo 1, if
> hardlockup detection was disabled by either the command line or patch 3.
The idea was to give the user two options to override the effect of patch 3/3.
Either via the kernel command line ('nmi_watchdog=') at boot time or via /proc
('echo 0' + 'echo 1') when the system is up and running.
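
As a rough sketch of the run-time option, the 'echo 0' + 'echo 1' sequence
could be wrapped as shown below. The function name rearm_watchdog and the
optional path argument are made up for illustration (the argument only exists
so the sequence can be tried against a scratch file); on a real system the
writes go to /proc/sys/kernel/nmi_watchdog and need root.

```shell
# Hypothetical helper, not part of the patch series: write 0 and then 1
# to the watchdog control file, then print the value it reports.
# The path defaults to the real sysctl file but can be overridden.
rearm_watchdog() {
    ctl="${1:-/proc/sys/kernel/nmi_watchdog}"
    echo 0 > "$ctl" || return 1   # stop lockup detection
    echo 1 > "$ctl" || return 1   # start it again, overriding the default
    cat "$ctl"
}
```

The boot-time alternative is the 'nmi_watchdog=' parameter on the kernel
command line. Note that the value printed here does not say whether hard
lockup detection is really active; the console message from
watchdog_nmi_enable() is what reports that.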