Re: [PATCH] printk: stop spining waiter when console resume to flush prb

From: Petr Mladek
Date: Thu May 06 2021 - 09:39:32 EST


On Thu 2021-05-06 16:00:26, Luo Jiaxing wrote:
> Some threads still call printk() for printing when resume_console() is
> being executed. In practice, the printk() is executed for a period of time
> and then returned. The duration is determined by the number of prints
> cached in the prb during the suspend/resume process. At the same time,
> resume_console() returns quickly.

The last sentence is a bit misleading. resume_console() returns
quickly only when @console_owner was passed to another process.


> Based on the owner/waiter mechanism, the first one who fails to lock the
> console will become the waiter and start spinning. When the current owner
> finishes printing one message, if a waiter is waiting, the owner will give
> up and let the waiter become the new owner. The new owner needs to flush
> the whole prb until the prb is empty or another new waiter comes and takes
> the job from him.
>
> So the first waiter after resume_console() will take seconds to help to

It need not be the first waiter. The console_lock ownership might be
passed several times.

But you have a point. Many messages might get accumulated when the
console was suspended and any console_owner might spend a long time
processing them. resume_console() seems to be always called in
preemptible context, so it is safe to process all messages here.
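
To illustrate the hand-over described above, here is a minimal
single-threaded userspace model. It is only a sketch: struct console_model
and flush_as_owner() are hypothetical names, not kernel APIs, and the
arrival of a waiter is simulated by a counter instead of a second CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct console_model {
	int pending;	/* messages still buffered */
	bool waiter;	/* somebody is spinning for the lock */
};

/*
 * Flush as the current owner. A waiter shows up after @waiter_after
 * messages (negative value: never). Returns how many messages this
 * owner had to print before it could stop.
 */
static int flush_as_owner(struct console_model *m, int waiter_after)
{
	int printed = 0;

	assert(m->pending >= 0);

	while (m->pending > 0) {
		m->pending--;		/* emit one message */
		printed++;

		if (printed == waiter_after)
			m->waiter = true;

		if (m->waiter) {
			/* Hand over: the waiter becomes the new owner. */
			m->waiter = false;
			break;
		}
	}
	return printed;
}
```

With 1000 buffered messages, an owner that gets a waiter after three
messages stops early, and the new owner is stuck with the remaining 997
unless somebody else takes over from it.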


> flush prb, but a driver which calls printk() may be bothered by this. Add
> a new flag to mark that resume is flushing the prb. When the console
> resumes, until the prb is empty, temporarily stop setting a new waiter.

> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -287,6 +287,9 @@ EXPORT_SYMBOL(console_set_on_cmdline);
>  /* Flag: console code may call schedule() */
>  static int console_may_schedule;
>
> +/* Flags: console flushing prb when resume */
> +static atomic_t console_resume_flush_prb = ATOMIC_INIT(0);
> +
>  enum con_msg_format_flags {
>  	MSG_FORMAT_DEFAULT = 0,
>  	MSG_FORMAT_SYSLOG = (1 << 0),
> @@ -1781,7 +1784,8 @@ static int console_trylock_spinning(void)
>  	raw_spin_lock(&console_owner_lock);
>  	owner = READ_ONCE(console_owner);
>  	waiter = READ_ONCE(console_waiter);
> -	if (!waiter && owner && owner != current) {
> +	if (!waiter && owner && owner != current &&
> +	    !atomic_read(&console_resume_flush_prb)) {

atomic_set()/atomic_read() do not provide any memory barriers.
IMHO, the atomic operations are not enough to serialize @console_owner
and @console_resume_flush_prb manipulation.

See below.

>  		WRITE_ONCE(console_waiter, true);
>  		spin = true;
>  	}
> @@ -2355,6 +2359,7 @@ void resume_console(void)
>  	if (!console_suspend_enabled)
>  		return;
>  	down_console_sem();
> +	atomic_set(&console_resume_flush_prb, 1);
>  	console_suspended = 0;
>  	console_unlock();
>  }
> @@ -2592,6 +2597,8 @@ void console_unlock(void)
>  	raw_spin_unlock(&logbuf_lock);
>
>  	up_console_sem();
> +	if (atomic_read(&console_resume_flush_prb))
> +		atomic_set(&console_resume_flush_prb, 0);

This should be done under console_lock. Otherwise,
it is not serialized at all.

Also there is one more return from console_unlock():

	if (!can_use_console()) {
		console_locked = 0;
		up_console_sem();
		return;
	}

@console_resume_flush_prb must be cleared here as well.
Otherwise, the next random console_unlock() caller will not
be allowed to pass the console lock owner.
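
A minimal userspace sketch of what "cleared under the lock, on every
return path" would look like. The struct and the model_*() helpers are
hypothetical stand-ins (a plain bool instead of console_sem, a field
instead of can_use_console()); this is a model of the rule, not a
proposed kernel change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; the names only mirror the patch. */
struct console_state {
	bool locked;		/* stands in for console_sem */
	bool resume_flush;	/* stands in for console_resume_flush_prb */
	bool usable;		/* stands in for can_use_console() */
	int pending;		/* buffered messages */
};

/* Single release helper: the flag is cleared while the lock is still held. */
static void model_release(struct console_state *s)
{
	assert(s->locked);
	s->resume_flush = false;
	s->locked = false;
}

static void model_console_unlock(struct console_state *s)
{
	if (!s->usable) {
		model_release(s);	/* early return path */
		return;
	}

	while (s->pending > 0)
		s->pending--;		/* emit one message */

	model_release(s);		/* normal return path */
}

static void model_resume_console(struct console_state *s)
{
	s->locked = true;		/* down_console_sem() in the patch */
	s->resume_flush = true;
	model_console_unlock(s);
}
```

Routing both returns through one helper is just one way to make sure
that no exit path leaves the flag set.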


OK, the above patch tries to tell console_trylock_spinning()
that it should ignore console_owner even when set.
The @console_resume_flush_prb variable is set and read by different
processes in parallel, which makes it complicated.

Instead, we should simply tell console_unlock() that it should not
set console_owner in this case. The most straightforward
way is to pass this via a parameter.

Such a console_unlock() variant might also be used in other locations
that run in preemptible context.
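
The idea can be modelled in userspace like this. It is only a sketch
under my own assumptions: the allow_handover parameter and all model_*()
names are hypothetical, and the check in model_would_spin() merely mirrors
the condition in console_trylock_spinning() quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct console_model {
	const void *owner;	/* stands in for console_owner */
	bool waiter;		/* stands in for console_waiter */
	int pending;		/* buffered messages */
};

/* Same condition as in console_trylock_spinning(): spin only if an owner is visible. */
static bool model_would_spin(struct console_model *m, const void *self)
{
	if (!m->waiter && m->owner && m->owner != self) {
		m->waiter = true;
		return true;
	}
	return false;
}

/* Start emitting one message; advertise the owner only when hand-over is allowed. */
static void model_begin_emit(struct console_model *m, const void *self,
			     bool allow_handover)
{
	assert(m->pending > 0);
	m->owner = allow_handover ? self : NULL;
}

/* Finish emitting one message. */
static void model_end_emit(struct console_model *m)
{
	m->owner = NULL;
	m->pending--;
}
```

When the owner is never advertised, a printk() caller that fails the
trylock sees no owner and returns immediately instead of becoming the
waiter. No extra shared flag is needed.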


What about the following patch?