Re: [PATCH v4] panic: Avoid the extra noise dmesg

From: Feng Tang
Date: Fri Feb 15 2019 - 00:55:23 EST


Hi all,

On Tue, Dec 11, 2018 at 09:32:30AM +0100, Petr Mladek wrote:
> On Mon 2018-12-10 10:49:22, Kees Cook wrote:
> > On Mon, Dec 10, 2018 at 10:17 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> > >
> > > On Fri, 7 Dec 2018 17:51:19 +0800
> > > Feng Tang <feng.tang@xxxxxxxxx> wrote:
> > >
> > > > When kernel panic happens, it will first print the panic call stack,
> > > > then the ending msg like:
> > > >
> > > > [ 35.743249] ---[ end Kernel panic - not syncing: Fatal exception
> > > > [ 35.749975] ------------[ cut here ]------------
> > > >
> > > > The above messages are very useful for debugging.
> > > >
> > > > But if the system is configured not to reboot on panic, say the "panic_timeout"
> > > > parameter equals 0, it will likely print out many noisy messages, such as a
> > > > WARN() call stack for each and every CPU except the panic one, messages
> > > > like below:
> > >
> > >
> > > > Keeping the interrupt disabled will avoid the noisy message.
> > > >
> > > > When code runs to this point, it means the user has chosen not to reboot or
> > > > do any special handling via the panic notifier method, so the only reason
> > > > to enable interrupts may be the sysrq magic key and the panic_blink function
> > > > (though they may not work even with irqs enabled).
> > > >
> > > > So make the irq disabled by default and add a cmdline parameter
> > > > "panic_keep_irq_on" to turn it on when needed.
> > > >
> > > > Signed-off-by: Feng Tang <feng.tang@xxxxxxxxx>
> > > >
> > >
> > > Acked-by: Steven Rostedt (VMware) <rostedt@xxxxxxxxxxx>
> >
> > I'm fine with the new boot param, but I think we need to leave it how
> > it was by default: systems that want to see the blinking aren't going
> > to have a screen to read about what boot param they need to add.
> > Currently, we'll blink and spew extra warnings. With this patch we'll
> > not blink and not spew: a headless machine will have no indication
> > that a panic happened.
>
> Just to be sure, in case you did not follow the other long
> discussion.
>
> I had an alternative approach to switch printk() into nop at
> this panic() stage. It can be restored when sysrq is triggered.
> This approach allows to keep blinking and sysrq working. People
> debugging the blinking would need to block this change but it
> should be rare.
>
> I do not insist on my solution. It looks a bit hacky. But it
> is simple and rather straightforward.

Sorry for the late response.

So currently, there are 2 proposals:
1. this v4 patch of "panic_keep_irq_on" flag (default on, to match
the current kernel behavior)
2. Petr's suggestion of adding a flag to suppress printk after entering
the late panic phase (blinking time), while keeping the sysrq printk
working.

Following is the draft patch based on Petr's suggestion:

Please review, thanks. I'm fine with both solutions.

- Feng

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 1f03078..8921fed 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -528,6 +528,11 @@ void __handle_sysrq(int key, bool check_mask)
 	struct sysrq_key_op *op_p;
 	int orig_log_level;
 	int i;
+	int old_val;
+
+	/* save the old panic printk flag and let sysrq output through */
+	old_val = panic_suppress_printk;
+	panic_suppress_printk = 0;

rcu_sysrq_start();
rcu_read_lock();
@@ -574,6 +579,8 @@ void __handle_sysrq(int key, bool check_mask)
 	}
 	rcu_read_unlock();
 	rcu_sysrq_end();
+
+	panic_suppress_printk = old_val;
 }

void handle_sysrq(int key)
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 8f0e68e..4120f3a 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -534,6 +534,7 @@ extern int panic_on_io_nmi;
extern int panic_on_warn;
extern int sysctl_panic_on_rcu_stall;
extern int sysctl_panic_on_stackoverflow;
+extern int panic_suppress_printk;

extern bool crash_kexec_post_notifiers;

diff --git a/kernel/panic.c b/kernel/panic.c
index f121e6b..0cd3a1b 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -46,6 +46,8 @@ int panic_on_warn __read_mostly;
int panic_timeout = CONFIG_PANIC_TIMEOUT;
EXPORT_SYMBOL_GPL(panic_timeout);

+int panic_suppress_printk;
+
#define PANIC_PRINT_TASK_INFO 0x00000001
#define PANIC_PRINT_MEM_INFO 0x00000002
#define PANIC_PRINT_TIMER_INFO 0x00000004
@@ -326,6 +328,7 @@ void panic(const char *fmt, ...)
 	}
 #endif
 	pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf);
+	panic_suppress_printk = 1;
 	local_irq_enable();
 	for (i = 0; ; i += PANIC_TIMER_STEP) {
 		touch_softlockup_watchdog();
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index d3d1703..c27bbf5 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1987,6 +1987,9 @@ asmlinkage __visible int printk(const char *fmt, ...)
 	va_list args;
 	int r;

+	if (unlikely(panic_suppress_printk))
+		return 0;
+
 	va_start(args, fmt);
 	r = vprintk_func(fmt, args);
 	va_end(args);
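
For illustration only, here is a minimal userspace sketch of the
save/clear/restore pattern the draft relies on. It is not kernel code:
demo_printk(), demo_handle_sysrq() and demo_self_check() are hypothetical
stand-ins for printk(), __handle_sysrq() and a test driver, and vprintf()
replaces vprintk_func(). Only the flag name panic_suppress_printk comes
from the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Same name as the flag in the draft patch; everything else is made up. */
int panic_suppress_printk;

/* Hypothetical stand-in for printk(): drop the message while the flag is set. */
int demo_printk(const char *fmt, ...)
{
	va_list args;
	int r;

	if (panic_suppress_printk)
		return 0;

	va_start(args, fmt);
	r = vprintf(fmt, args);
	va_end(args);
	return r;
}

/* Hypothetical stand-in for __handle_sysrq(): lift suppression for the
 * duration of the handler, then restore whatever value was set before. */
int demo_handle_sysrq(void)
{
	int old_val = panic_suppress_printk;
	int r;

	panic_suppress_printk = 0;
	r = demo_printk("sysrq: handler output\n");
	panic_suppress_printk = old_val;
	return r;
}

/* Walk through the sequence: normal output, late panic phase, sysrq. */
int demo_self_check(void)
{
	panic_suppress_printk = 0;
	assert(demo_printk("end Kernel panic\n") > 0);	/* printed */

	panic_suppress_printk = 1;			/* late panic phase */
	assert(demo_printk("WARN noise\n") == 0);	/* dropped */

	assert(demo_handle_sysrq() > 0);		/* sysrq still prints */
	assert(panic_suppress_printk == 1);		/* suppression restored */
	return 0;
}
```

Calling demo_self_check() from a small main() exercises the three phases.
The point of saving old_val rather than writing 1 back unconditionally is
that sysrq can also be triggered outside of panic, where the flag is 0 and
must stay 0.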