Re: Removal of printk safe buffers delays NMI context printk

From: Nicholas Piggin
Date: Fri Nov 05 2021 - 19:57:57 EST


Excerpts from Petr Mladek's message of November 6, 2021 2:23 am:
> On Fri 2021-11-05 15:03:27, John Ogness wrote:
>> On 2021-11-05, Nicholas Piggin <npiggin@xxxxxxxxx> wrote:
>> >> What was removed from 93d102f094b was irq_work triggering on all
>> >> CPUs.
>> >
>> > No, it was the caller executing the flush for all remote CPUs itself.
>> > irq work was not involved (and irq work can't be raised in a remote
>> > CPU from NMI context).
>>
>> Maybe I am missing something. In 93d102f094b~1 I see:
>>
>> watchdog_smp_panic
>> printk_safe_flush
>> __printk_safe_flush
>> printk_safe_flush_buffer
>> printk_safe_flush_line
>> printk_deferred
>> vprintk_deferred
>> vprintk_emit (but no direct printing)
>> defer_console_output
>> irq_work_queue
>>
>> AFAICT, using defer_console_output() instead of your new printk_flush()
>> should cause the exact behavior as before.
>
> I agree. printk_safe_flush() used printk_deferred(). It only queued
> the irq_work and never called consoles directly.
>
>> > but we do need that printk flush capability back there and for
>> > nmi_backtrace.
>>
>> Agreed. I had not considered this necessary side-effect when I removed
>> the NMI safe buffers.
>
> Honestly, I do not understand why it stopped working or how
> it worked before.
>
> printk() calls vprintk(). Current vprintk() does:
>
> asmlinkage int vprintk(const char *fmt, va_list args)
> {
> [...]
> /*
> * Use the main logbuf even in NMI. But avoid calling console
> * drivers that might have their own locks.
> */
> if (this_cpu_read(printk_context) || in_nmi()) {
> int len;
>
> len = vprintk_store(0, LOGLEVEL_DEFAULT, NULL, fmt, args);
> defer_console_output();
> return len;
> }
>
> /* No obstacles. */
> return vprintk_default(fmt, args);
> }
>
> By other words, current vprintk():
>
> + queues irq_work() in NMI context
> + tries to flush consoles immeditely otherwise
>
>
>> I am just wondering if we should fix the regression by going back to
>> using irq_work (such as defer_console_output()) or if we want to
>> introduce something new that introduces direct printing.
>
> Yup, defer_console_output() should do the same as printk_safe_flush()
> before. We do not longer need to copy the messages because they are
> already in the main lockless log buffer.
>
> Well, I am curious about the original code. The commit 93d102f094be9beab28e5
> ("printk: remove safe buffers") did the following:
>
> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
> index c9a8f4781a10..dc17d8903d4f 100644
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -183,11 +183,6 @@ static void watchdog_smp_panic(int cpu, u64 tb)
>
> wd_smp_unlock(&flags);
>
> - printk_safe_flush();
> - /*
> - * printk_safe_flush() seems to require another print
> - * before anything actually goes out to console.
> - */
> if (sysctl_hardlockup_all_cpu_backtrace)
> trigger_allbutself_cpu_backtrace();
>
> And I am curious because:
>
> + Why was printk_safe_flush() called before triggering backtraces
> on other CPUs?

I commented about that earlier in the thread. The flush is for a
previous non-maskable IPI. trigger_allbutself_cpu_backtrace is not NMI
on powerpc it's just regular irq context so if they did respond to it
they would also be able to run the irq work.

> + The comment says that another print is needed before the messages
> goes to the console. It makes sense because printk_safe_flush()
> only set irq_work. But the patch did not remove any printk().
> So, nobody called any printk() even before.

I think that was my misunderstanding as to the problem. The NMI IPI is
actually asynchronous so it was not guaranteed that targets had
completed their backtraces at this point. powerpc watchdog needs a
separate fix (after which it seems to work fine with printk_safe_flush)

Thanks,
Nick