Re: [PATCH v3 4/4] panic: use sys_info_with_filter() to avoid duplicate backtraces

From: Bradley Morgan

Date: Fri Jun 26 2026 - 08:20:05 EST


On June 26, 2026 1:14:14 PM GMT+01:00, Petr Mladek <pmladek@xxxxxxxx>
wrote:
>On Fri 2026-06-26 12:23:50, Petr Mladek wrote:
>> On Thu 2026-06-25 15:25:58, Bradley Morgan wrote:
>> > panic_other_cpus_shutdown() handles SYS_INFO_ALL_BT before stopping the
>> > other CPUs. Do not ask sys_info() to handle that bit again later in the
>> > panic path.
>> >
>> > Use sys_info_with_filter() so panic_print=all_bt does not request more
>> > output after the CPUs are stopped.
>> >
>> > Fixes: a9af76a78760 ("watchdog: add sys_info sysctls to dump sys info on system lockup")
>> > Cc: stable@xxxxxxxxxxxxxxx
>> > Signed-off-by: Bradley Morgan <include@xxxxxxxxx>
>> > ---
>> > kernel/panic.c | 2 +-
>> > 1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/kernel/panic.c b/kernel/panic.c
>> > index 213725b612aa..eb842823df61 100644
>> > --- a/kernel/panic.c
>> > +++ b/kernel/panic.c
>> > @@ -680,7 +680,7 @@ void vpanic(const char *fmt, va_list args)
>> > */
>> > atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
>> >
>> > - sys_info(panic_print);
>> > + sys_info_with_filter(panic_print, SYS_INFO_ALL_BT);
>>
>> Hmm, this prevents printing backtraces from all CPUs completely.
>> But what if they were not printed?
>>
>> They might be printed by:
>>
>> static void panic_other_cpus_shutdown(bool crash_kexec)
>> {
>> if (panic_print & SYS_INFO_ALL_BT)
>> panic_trigger_all_cpu_backtrace();
>>
>> [...]
>> }
>>
>> But it checks only the "panic_print" variable. It won't do anything
>> when (panic_print == 0).
>>
>> In this case, we might still want to print the backtraces when
>> SYS_INFO_ALL_BT is set in kernel_si_info.
>>
>> > kmsg_dump_desc(KMSG_DUMP_PANIC, buf);
>>
>> Of course, we might fix panic_other_cpus_shutdown() to check also
>> kernel_si_info.
>>
>> But it all becomes very hairy. We have several levels:
>>
>> + watchdog-all_bt-specific option, e.g. sysctl_hardlockup_all_cpu_backtrace
>>
>> + watchdog-specific si_info preferences, e.g. hardlockup_si_mask
>>
>> + panic-specific si_info: panic_print
>>
>> + universal fallback for any layer: kernel_si_info
>>
>> Now, we try to check all these variables back and forth to
>> trigger all backtraces or to avoid triggering them.
>> And it clearly does not work well and the code is more and more
>> hairy.
>>
>> I think about another approach. The word "waterfall" comes to my mind.
>> Instead of checking all the settings back and forth, let's process
>> each setting one by one and just remember what has been done and
>> skip this in the next level.
>>
>> All the si_info actions seem to dump a global system state.
>> So, it would make sense to remember the state in a global variable
>> even when it might be modified by more CPUs in parallel.
>>
>> I am going to think more about it.
>
>I have created a POC using Gemini. I haven't tested it.
>But it looks acceptable. And the logic seems to be more
>straightforward.
>
>One drawback is that it requires adding the _reset()
>call for all sys_info() callers. It is fine in principle
>but it might complicate back-porting because all changes
>have to be done in one patch.
>
>But honestly, this is a nice to have fix. Most people could
>live happily without it.
>
>From 3c66436d9978030845a96bfaedd6b914536e2ac4 Mon Sep 17 00:00:00 2001
>From: Petr Mladek <pmladek@xxxxxxxx>
>Date: Fri, 26 Jun 2026 13:55:41 +0200
>Subject: [POC] sys_info: Introduce state-tracking APIs to prevent duplicate
> backtraces
>
>In watchdog, panic, and hung task detection scenarios, sys_info() can
>be called multiple times or alongside direct backtrace triggers like
>trigger_allbutcpu_cpu_backtrace(). This results in identical backtraces
>being dumped repeatedly from all CPUs, cluttering the kernel log and
>delaying or obscuring critical debug details.
>
>Introduce a state tracking bitmask and associated helpers:
>- sys_info_done(mask): Marks specific sys_info bits as already printed.
>- sys_info_reset(): Resets the tracking state.
>- sys_info_is_done(mask): Checks if all bits in the mask have been printed.
>
>Update sys_info() to automatically filter out already printed bits
>using this state. Integrate these APIs with the generic hardlockup
>and softlockup watchdogs, the PowerPC watchdog, the hung task detector,
>and the panic core. This ensures that each piece of system information
>and backtrace output is printed at most once per lockup/panic event,
>and the state is reset cleanly when a lockup does not trigger a panic.
>
>Races between sys_info() callers are ignored. It should be acceptable
>because the output from various watchdogs has never been synchronized.
>And panic() never returns.
>
>Assisted-by: gemini-1.5-flash ?

Why not use gemini 3.5 flash?

I can try if you want.

Could I have the prompt you used? :)

>Signed-off-by: Petr Mladek <pmladek@xxxxxxxx>
>---
> arch/powerpc/kernel/watchdog.c | 13 ++++++++++---
> include/linux/sys_info.h | 3 +++
> kernel/hung_task.c | 2 ++
> kernel/panic.c | 4 +++-
> kernel/watchdog.c | 10 ++++++++--
> lib/sys_info.c | 30 +++++++++++++++++++++++++++++-
> 6 files changed, 55 insertions(+), 7 deletions(-)
>
>diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
>index c40c69368476..0eab7894b9dc 100644
>--- a/arch/powerpc/kernel/watchdog.c
>+++ b/arch/powerpc/kernel/watchdog.c
>@@ -239,6 +239,7 @@ static void watchdog_smp_panic(int cpu)
> if (sysctl_hardlockup_all_cpu_backtrace ||
> (hardlockup_si_mask & SYS_INFO_ALL_BT)) {
> trigger_allbutcpu_cpu_backtrace(cpu);
>+ sys_info_done(SYS_INFO_ALL_BT);
> cpumask_clear(&wd_smp_cpus_ipi);
> } else {
> /*
>@@ -251,10 +252,12 @@ static void watchdog_smp_panic(int cpu)
> }
> }
>
>- sys_info(hardlockup_si_mask & ~SYS_INFO_ALL_BT);
>+ sys_info(hardlockup_si_mask);
> if (hardlockup_panic)
> nmi_panic(NULL, "Hard LOCKUP");
>
>+ sys_info_reset();
>+
> wd_end_reporting();
>
> return;
>@@ -419,13 +422,17 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
> xchg(&__wd_nmi_output, 1); // see wd_lockup_ipi
>
> if (sysctl_hardlockup_all_cpu_backtrace ||
>- (hardlockup_si_mask & SYS_INFO_ALL_BT))
>+ (hardlockup_si_mask & SYS_INFO_ALL_BT)) {
> trigger_allbutcpu_cpu_backtrace(cpu);
>+ sys_info_done(SYS_INFO_ALL_BT);
>+ }
>
>- sys_info(hardlockup_si_mask & ~SYS_INFO_ALL_BT);
>+ sys_info(hardlockup_si_mask);
> if (hardlockup_panic)
> nmi_panic(regs, "Hard LOCKUP");
>
>+ sys_info_reset();
>+
> wd_end_reporting();
> }
> /*
>diff --git a/include/linux/sys_info.h b/include/linux/sys_info.h
>index a5bc3ea3d44b..ad43548c75dd 100644
>--- a/include/linux/sys_info.h
>+++ b/include/linux/sys_info.h
>@@ -18,6 +18,9 @@
> #define SYS_INFO_BLOCKED_TASKS 0x00000080
>
> void sys_info(unsigned long si_mask);
>+void sys_info_done(unsigned long si_mask);
>+void sys_info_reset(void);
>+bool sys_info_is_done(unsigned long si_mask);
> unsigned long sys_info_parse_param(char *str);
>
> #ifdef CONFIG_SYSCTL
>diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>index 6fcc94ce4ca9..dbb6a27770f5 100644
>--- a/kernel/hung_task.c
>+++ b/kernel/hung_task.c
>@@ -354,6 +354,8 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>
> if (hung_task_call_panic)
> panic("hung_task: blocked tasks");
>+
>+ sys_info_reset();
> }
>
> static long hung_timeout_jiffies(unsigned long last_checked,
>diff --git a/kernel/panic.c b/kernel/panic.c
>index 213725b612aa..86ce17f03da2 100644
>--- a/kernel/panic.c
>+++ b/kernel/panic.c
>@@ -550,8 +550,10 @@ static void panic_trigger_all_cpu_backtrace(void)
> */
> static void panic_other_cpus_shutdown(bool crash_kexec)
> {
>- if (panic_print & SYS_INFO_ALL_BT)
>+ if ((panic_print & SYS_INFO_ALL_BT) && !sys_info_is_done(SYS_INFO_ALL_BT)) {
> panic_trigger_all_cpu_backtrace();
>+ sys_info_done(SYS_INFO_ALL_BT);
>+ }
>
> /*
> * Note that smp_send_stop() is the usual SMP shutdown function,
>diff --git a/kernel/watchdog.c b/kernel/watchdog.c
>index 87dd5e0f6968..f431087c68a7 100644
>--- a/kernel/watchdog.c
>+++ b/kernel/watchdog.c
>@@ -282,14 +282,17 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
>
> if (hardlockup_all_cpu_backtrace) {
> trigger_allbutcpu_cpu_backtrace(cpu);
>+ sys_info_done(SYS_INFO_ALL_BT);
> if (!hardlockup_panic)
> clear_bit_unlock(0, &hard_lockup_nmi_warn);
> }
>
>- sys_info(hardlockup_si_mask & ~SYS_INFO_ALL_BT);
>+ sys_info(hardlockup_si_mask);
> if (hardlockup_panic)
> nmi_panic(regs, "Hard LOCKUP");
>
>+ sys_info_reset();
>+
> per_cpu(watchdog_hardlockup_warned, cpu) = true;
> }
>
>@@ -895,16 +898,19 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>
> if (softlockup_all_cpu_backtrace) {
> trigger_allbutcpu_cpu_backtrace(smp_processor_id());
>+ sys_info_done(SYS_INFO_ALL_BT);
> if (!softlockup_panic)
> clear_bit_unlock(0, &soft_lockup_nmi_warn);
> }
>
> add_taint(TAINT_SOFTLOCKUP, LOCKDEP_STILL_OK);
>- sys_info(softlockup_si_mask & ~SYS_INFO_ALL_BT);
>+ sys_info(softlockup_si_mask);
> thresh_count = duration / get_softlockup_thresh();
>
> if (softlockup_panic && thresh_count >= softlockup_panic)
> panic("softlockup: hung tasks");
>+
>+ sys_info_reset();
> }
>
> return HRTIMER_RESTART;
>diff --git a/lib/sys_info.c b/lib/sys_info.c
>index f32a06ec9ed4..f8e6176fae75 100644
>--- a/lib/sys_info.c
>+++ b/lib/sys_info.c
>@@ -160,7 +160,35 @@ static void __sys_info(unsigned long si_mask)
> show_state_filter(TASK_UNINTERRUPTIBLE);
> }
>
>+static unsigned long sys_info_done_mask;
>+
>+void sys_info_done(unsigned long si_mask)
>+{
>+ sys_info_done_mask |= si_mask;
>+}
>+
>+void sys_info_reset(void)
>+{
>+ sys_info_done_mask = 0;
>+}
>+
>+bool sys_info_is_done(unsigned long si_mask)
>+{
>+ return (sys_info_done_mask & si_mask) == si_mask;
>+}
>+
> void sys_info(unsigned long si_mask)
> {
>- __sys_info(si_mask ? : kernel_si_mask);
>+ unsigned long mask;
>+
>+ if (si_mask)
>+ mask = si_mask & ~sys_info_done_mask;
>+ else
>+ mask = kernel_si_mask & ~sys_info_done_mask;
>+
>+ if (!mask)
>+ return;
>+
>+ __sys_info(mask);
>+ sys_info_done(mask);
> }
>
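
For reference, a minimal user-space sketch of the done-mask ("waterfall")
logic from the POC above. It is only a model, not kernel code: dump_stub()
and last_dumped are hypothetical stand-ins for __sys_info(), the value of
SYS_INFO_ALL_BT is assumed for illustration, the kernel_si_mask fallback
is left out, and there is no locking (the POC also ignores races).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* BLOCKED_TASKS matches the quoted header; the ALL_BT value is assumed. */
#define SYS_INFO_ALL_BT		0x00000040UL
#define SYS_INFO_BLOCKED_TASKS	0x00000080UL

static unsigned long sys_info_done_mask;
static unsigned long last_dumped;	/* hypothetical: what the stub last printed */

/* Hypothetical stand-in for __sys_info(): just record and print the mask. */
static void dump_stub(unsigned long mask)
{
	last_dumped = mask;
	printf("dumping 0x%lx\n", mask);
}

static void sys_info_done(unsigned long si_mask)
{
	sys_info_done_mask |= si_mask;
}

static void sys_info_reset(void)
{
	sys_info_done_mask = 0;
}

static bool sys_info_is_done(unsigned long si_mask)
{
	return (sys_info_done_mask & si_mask) == si_mask;
}

/* Dump only the bits that no earlier level has handled, then record them. */
static void sys_info(unsigned long si_mask)
{
	unsigned long mask = si_mask & ~sys_info_done_mask;

	if (!mask)
		return;

	dump_stub(mask);
	sys_info_done(mask);
}

/* Walk through one lockup event that escalates to the panic level. */
static void demo_lockup_then_panic(void)
{
	sys_info_reset();
	last_dumped = 0;

	/* Watchdog level: backtraces were triggered directly, so record it. */
	sys_info_done(SYS_INFO_ALL_BT);
	sys_info(SYS_INFO_ALL_BT | SYS_INFO_BLOCKED_TASKS);
	assert(last_dumped == SYS_INFO_BLOCKED_TASKS);	/* ALL_BT was skipped */

	/* Panic level: ALL_BT is already done, so nothing is dumped again. */
	last_dumped = 0;
	sys_info(SYS_INFO_ALL_BT);
	assert(last_dumped == 0);
	assert(sys_info_is_done(SYS_INFO_ALL_BT | SYS_INFO_BLOCKED_TASKS));

	/* No panic: reset, and the next event dumps ALL_BT again. */
	sys_info_reset();
	sys_info(SYS_INFO_ALL_BT);
	assert(last_dumped == SYS_INFO_ALL_BT);
}
```

Calling demo_lockup_then_panic() walks one event through the watchdog and
panic levels and then through the reset. The point of the model is that
callers can pass their full mask and rely on the shared state, instead of
masking out SYS_INFO_ALL_BT by hand at every call site.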

Thanks!