Re: [BUG] 8e274732115f ("printk: extend console_lock for per-console locking")

From: John Ogness
Date: Sun Jun 12 2022 - 12:03:18 EST


Hi Paul,

Thanks for looking into this! I am currently on vacation with family, so
my responses are limited. Some initial comments from me below...

On 2022-06-12, "Paul E. McKenney" <paulmck@xxxxxxxxxx> wrote:
> And the patch below takes care of things in (admittedly quite light)
> testing thus far. What it does is add ten seconds of pure delay
> before rcutorture shuts down the system. Presumably, this delay gives
> printk() the time that it needs to flush its buffers. In the
> configurations that I have tested thus far, anyway.
>
> So what should I be doing instead?
>
> o console_flush_on_panic() seems like strong medicine, but might
> be the right thing to do. The bit about proceeding even though
> it failed to acquire the lock doesn't look good for non-panic
> use.

For sure not this one.

> o printk_trigger_flush() has an attractive name, but it looks
> like it only just starts the flush rather than waiting for it
> to finish.

Correct. It just triggers.

> o pr_flush(1000, true) looks quite interesting, and also seems to
> work in a few quick tests, so I will continue playing with that.

This is only useful if the context is guaranteed may_sleep().

What is _supposed_ to happen is that @system_state increases above
SYSTEM_RUNNING, which then causes direct printing to be used. So the
pr_emerg("Power down\n") in kernel_power_off() would directly flush all
remaining messages.

But if the threaded printers are already in the process of printing,
they block direct printing. That may be what we are seeing here.

What I find particularly interesting is that it is not the kthread-patch
that is causing the issue.

I will have some time tonight to take a closer look.

John Ogness