Re: [PATCH 1/1] watchdog: avoid extra sys_info dumps for all_bt
From: Bradley Morgan
Date: Mon Jun 22 2026 - 02:36:13 EST
On June 22, 2026 4:23:00 AM GMT+01:00, Feng Tang
<feng.tang@xxxxxxxxxxxxxxxxx> wrote:
>Hi,
>
>On Sat, Jun 20, 2026 at 10:01:40PM +0000, Bradley Morgan wrote:
>> The watchdog handles SYS_INFO_ALL_BT itself. When that is the only
>> watchdog specific bit, sys_info(0) falls back to kernel_sys_info.
>>
>> Skip sys_info() for that case.
>
>Thanks for the patch!
>
>I would explain some about the intention of the global 'kernel_sys_info'
>which was suggested by Andrew. One use case is, for kernel stability
>issues, it could be panic, soft/hard lockup, task-hung etc, and we now
>have different control knobs for them each, but with similar 'xxx_sys_info'
>capability, and they actually need similar system info like all-cpu
>dump, blocked processes dump, debug ftrace dump, using one central
>'kernel_sys_info' could be very handy by avoiding setting many knobs.
>And for debugging random issues, you can just add sys_info(0)
>everywhere, and control it by the existing 'kernel_si_mask'.
>
>btw, did you meet some issues with current code? If yes, could you
>give some more details? IIUC, when 'kernel_si_mask' is not set
>specifically, sys_info(0) is a nop.
>
>Thanks,
>Feng
Hi Feng,

I see what you are saying, but this is a double-edged sword. :)
The issue is the case where the watchdog-specific mask is explicitly
set to only all_bt.
The watchdog handles SYS_INFO_ALL_BT itself and masks that bit out
before calling sys_info(), so the later call becomes sys_info(0).
At that point sys_info() cannot tell that the caller had a nonzero
watchdog mask, and it falls back to kernel_sys_info.
Here is an example where this is reached:

    kernel.kernel_sys_info=tasks,mem
    kernel.hardlockup_sys_info=all_bt

With these settings, a hard lockup still dumps tasks and memory through
the global default. I expected hardlockup_sys_info to override the
global default once it is explicitly set.
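
To make the example concrete, here is a small userspace sketch of the mask arithmetic. It is not kernel code: the bit values are made up for illustration, model_sys_info() is a simplified stand-in that only models the "zero mask falls back to the global default" behaviour described above, and current_watchdog_dump() is a hypothetical name for the existing call site.

```c
/* Hypothetical bit values, for illustration only. */
#define SYS_INFO_TASKS   0x1UL
#define SYS_INFO_MEM     0x2UL
#define SYS_INFO_ALL_BT  0x4UL

/* Stand-in for the global default, as if kernel.kernel_sys_info=tasks,mem. */
static unsigned long kernel_si_mask = SYS_INFO_TASKS | SYS_INFO_MEM;

/*
 * Simplified model of sys_info(): returns the mask that would actually
 * be dumped. A zero argument falls back to the global default.
 */
static unsigned long model_sys_info(unsigned long si_mask)
{
	return si_mask ? si_mask : kernel_si_mask;
}

/*
 * Model of the current watchdog call site: all_bt is handled locally,
 * so the bit is stripped before the mask is passed on.
 */
static unsigned long current_watchdog_dump(unsigned long watchdog_si_mask)
{
	return model_sys_info(watchdog_si_mask & ~SYS_INFO_ALL_BT);
}
```

In this model, current_watchdog_dump(SYS_INFO_ALL_BT) evaluates to SYS_INFO_TASKS | SYS_INFO_MEM, which is the unexpected tasks and memory dump from the example.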
If hardlockup_sys_info is empty, this patch keeps the existing
fallback.

So the bug is not sys_info(0) itself, but the watchdog turning an
explicit all_bt-only mask into zero after handling all_bt locally.
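
As a rough illustration of the intended behaviour, the helper from the patch can be modelled in userspace like this. The bit values are again made up, model_sys_info() is a simplified stand-in for sys_info() that returns the mask it would dump, and patched_watchdog_dump() is a hypothetical name mirroring watchdog_sys_info(); a return value of 0 means that nothing extra is dumped.

```c
/* Hypothetical bit values, for illustration only. */
#define SYS_INFO_TASKS   0x1UL
#define SYS_INFO_MEM     0x2UL
#define SYS_INFO_ALL_BT  0x4UL

/* Stand-in for the global default, as if kernel.kernel_sys_info=tasks,mem. */
static unsigned long kernel_si_mask = SYS_INFO_TASKS | SYS_INFO_MEM;

/* Simplified model of sys_info(): a zero mask falls back to the default. */
static unsigned long model_sys_info(unsigned long si_mask)
{
	return si_mask ? si_mask : kernel_si_mask;
}

/* Model of watchdog_sys_info() from the patch. */
static unsigned long patched_watchdog_dump(unsigned long si_mask)
{
	unsigned long dump_mask = si_mask & ~SYS_INFO_ALL_BT;

	/* Explicit all_bt-only mask: already handled locally, skip sys_info(). */
	if (si_mask && !dump_mask)
		return 0;

	/* Empty mask keeps the fallback; other bits are passed through. */
	return model_sys_info(dump_mask);
}
```

Under these assumptions an all_bt-only mask yields 0, an empty mask still yields the global default, and all_bt combined with other bits yields just the other bits.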
What do you think, Feng? Does this sound good to you? :)

Cheers!
>> Fixes: a9af76a78760 ("watchdog: add sys_info sysctls to dump sys info on system lockup")
>> Signed-off-by: Bradley Morgan <include@xxxxxxxxx>
>> ---
>> kernel/watchdog.c | 22 ++++++++++++++++++----
>> 1 file changed, 18 insertions(+), 4 deletions(-)
>>
>> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
>> index 87dd5e0f6968..bad390a9b59e 100644
>> --- a/kernel/watchdog.c
>> +++ b/kernel/watchdog.c
>> @@ -54,6 +54,16 @@ static int __read_mostly watchdog_hardlockup_available;
>> struct cpumask watchdog_cpumask __read_mostly;
>> unsigned long *watchdog_cpumask_bits = cpumask_bits(&watchdog_cpumask);
>>
>> +static void watchdog_sys_info(unsigned long si_mask)
>> +{
>> + unsigned long dump_mask = si_mask & ~SYS_INFO_ALL_BT;
>> +
>> + if (si_mask && !dump_mask)
>> + return;
>> +
>> + sys_info(dump_mask);
>> +}
>> +
>> #ifdef CONFIG_HARDLOCKUP_DETECTOR
>>
>> # ifdef CONFIG_SMP
>> @@ -208,6 +218,7 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
>> {
>> int hardlockup_all_cpu_backtrace;
>> unsigned int this_cpu;
>> + unsigned long si_mask;
>> unsigned long flags;
>>
>> if (per_cpu(watchdog_hardlockup_touched, cpu)) {
>> @@ -216,7 +227,8 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
>> return;
>> }
>>
>> - hardlockup_all_cpu_backtrace = (hardlockup_si_mask & SYS_INFO_ALL_BT) ?
>> + si_mask = READ_ONCE(hardlockup_si_mask);
>> + hardlockup_all_cpu_backtrace = (si_mask & SYS_INFO_ALL_BT) ?
>> 1 : sysctl_hardlockup_all_cpu_backtrace;
>> /*
>> * Check for a hardlockup by making sure the CPU's timer
>> @@ -286,7 +298,7 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
>> clear_bit_unlock(0, &hard_lockup_nmi_warn);
>> }
>>
>> - sys_info(hardlockup_si_mask & ~SYS_INFO_ALL_BT);
>> + watchdog_sys_info(si_mask);
>> if (hardlockup_panic)
>> nmi_panic(regs, "Hard LOCKUP");
>>
>> @@ -798,6 +810,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>> struct pt_regs *regs = get_irq_regs();
>> int softlockup_all_cpu_backtrace;
>> int duration, thresh_count;
>> + unsigned long si_mask;
>> unsigned long flags;
>>
>> if (!watchdog_enabled)
>> @@ -809,7 +822,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>> if (panic_in_progress())
>> return HRTIMER_NORESTART;
>>
>> - softlockup_all_cpu_backtrace = (softlockup_si_mask & SYS_INFO_ALL_BT) ?
>> + si_mask = READ_ONCE(softlockup_si_mask);
>> + softlockup_all_cpu_backtrace = (si_mask & SYS_INFO_ALL_BT) ?
>> 1 : sysctl_softlockup_all_cpu_backtrace;
>>
>> watchdog_hardlockup_kick();
>> @@ -900,7 +914,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>> }
>>
>> add_taint(TAINT_SOFTLOCKUP, LOCKDEP_STILL_OK);
>> - sys_info(softlockup_si_mask & ~SYS_INFO_ALL_BT);
>> + watchdog_sys_info(si_mask);
>> thresh_count = duration / get_softlockup_thresh();
>>
>> if (softlockup_panic && thresh_count >= softlockup_panic)
>> --
>> 2.53.0
>