Re: [BUG] 8e274732115f ("printk: extend console_lock for per-console locking")

From: Petr Mladek
Date: Mon Jun 13 2022 - 07:06:06 EST


On Mon 2022-06-13 11:10:19, John Ogness wrote:
> On 2022-06-12, "Paul E. McKenney" <paulmck@xxxxxxxxxx> wrote:
> >> As I suspected, the final printk's cannot direct print because the
> >> kthread was printing. Using the below patch did seem to address your
> >> problem. But this is probably not the way forward.
> >
> > When I apply it, I still lose output, perhaps due to different timing?
> > Doing the pr_flush(1000, true) just before the call to kernel_power_off()
> > has been working quite well thus far, though.
>
> Your pr_flush() is appropriate for your RCU tests, but this is a problem
> in general that needs to be addressed. I suppose we should start a new
> thread for that. ;-)
>
> During development we experimented with the idea of kthreads pausing
> themselves whenever direct printing is activated. It was racey because
> there are situations when direct printing is only temporarily active and
> it was hard to coordinate who prints when direct printing becomes
> inactive again. So we dropped that idea. However, in this situation the
> system will not be disabling direct printing.
>
> @Paul, can you try the below change instead? Until this has been
> officially solved, you probably want to keep your pr_flush()
> solution. (After all, that is exactly what pr_flush() is for.) But it
> would be helpful if you could run this last test for us.
>
> @Petr, I like the idea of the kthreads getting out of the way rather
> than trying to direct print themselves (for this situation). It still
> isn't optimal because that final pr_emerg("Power down\n") might come
> before the kthread has finished its current line. But in that case the
> kthread may not have much a chance to finish the printing anyway.

I wonder if we could somehow combine it with pr_flush(timeout).

The kthread might bail-out when pr_flush() is running. It will
know that someone would continue printing. The timeout might
help to avoid a deadlock. We could somehow reuse
console_trylock_spinning() code here.


> John Ogness
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index ea3dd55709e7..45c6c2b0b104 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3729,7 +3729,9 @@ static bool printer_should_wake(struct console *con, u64 seq)
> return true;
>
> if (con->blocked ||
> - console_kthreads_atomically_blocked()) {
> + console_kthreads_atomically_blocked() ||
> + system_state > SYSTEM_RUNNING ||
> + oops_in_progress) {
> return false;
> }

Also this is an interesting idea. We know that panic() will try
to flush the messages. Well, panic() is not always called
after Oops.

Best Regards,
Petr