Re: [PATCH v3 2/2] Handle flushing of CPU backtraces during panic

From: Petr Mladek
Date: Tue Aug 13 2024 - 09:45:34 EST


On Mon 2024-08-12 16:29:31, takakura@xxxxxxxxxxxxx wrote:
> From: Ryo Takakura <takakura@xxxxxxxxxxxxx>
>
> After panic, non-panicked CPU's has been unable to flush ringbuffer
> while they can still write into it. This can affect CPU backtrace
> triggered in panic only able to write into ringbuffer incapable of
> flushing them.

I still think about this. The motivation is basically the same
as in the commit 5d5e4522a7f404d1a96fd ("printk: restore flushing
of NMI buffers on remote CPUs after NMI backtraces").

It would make sense to replace printk_trigger_flush() with
console_try_flush(). And the function should queue the irq
work when it can't be flushed directly, see below.

> Fix the issue by letting the panicked CPU handle the flushing of
> ringbuffer right after non-panicked CPUs finished writing their
> backtraces.
>
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -260,6 +260,7 @@ static void panic_other_cpus_shutdown(bool crash_kexec)
> panic_triggering_all_cpu_backtrace = true;
> trigger_all_cpu_backtrace();
> panic_triggering_all_cpu_backtrace = false;
> + console_try_flush();
> }
>
> /*
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3284,6 +3284,20 @@ void console_flush_on_panic(enum con_flush_mode mode)
> console_flush_all(false, &next_seq, &handover);
> }
>
> +/**
> + * console_try_flush - try to flush consoles when safe
> + *
> + * Context: Any, except for NMI.

It is safe even in NMI. is_printk_legacy_deferred() would return true
in this case.

> + */
> +void console_try_flush(void)
> +{
> + if (is_printk_legacy_deferred())
> + return;
> +
> + if (console_trylock())
> + console_unlock();
> +}

I would do something like:

/**
* console_try_or_trigger_flush - try to flush consoles directly when
* safe or the trigger deferred flush.
*
* Context: Any
*/
void console_try_or_trigger_flush(void)
{
if (!is_printk_legacy_deferred() && console_trylock())
console_unlock();
else
defer_console_output();
}

and use it instead of printk_trigger_flush() in
nmi_trigger_cpumask_backtrace().


Well, I would postpone this patch after we finalize the patchset
adding con->write_atomic() callback. This patch depends on it anyway
via is_printk_legacy_deferred(). The patchset might also add
other wrappers for flushing consoles and we have to choose some
reasonable names. Or John could integrate this patch into the patchset.

Best Regards,
Petr