Re: [PATCH] printk/console: Check consistent sequence number when handling race in console_unlock()

From: Sergey Senozhatsky
Date: Tue Jun 29 2021 - 11:42:23 EST


On (21/06/29 16:33), Petr Mladek wrote:
> The standard printk() tries to flush the message to the console
> immediately. It tries to take the console lock. If the lock is
> already taken then the current owner is responsible for flushing
> even the new message.
>
> There is a small race window between checking whether a new message is
> available and releasing the console lock. It is solved by re-checking
> the state after releasing the console lock. If the check is positive
> then console_unlock() tries to take the lock again and process the new
> message as well.
[..]
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 142a58d124d9..87411084075e 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2545,6 +2545,7 @@ void console_unlock(void)
> bool do_cond_resched, retry;
> struct printk_info info;
> struct printk_record r;
> + u64 next_seq;
>
> if (console_suspended) {
> up_console_sem();
> @@ -2654,8 +2655,10 @@ void console_unlock(void)
> cond_resched();
> }
>
> - console_locked = 0;
> + /* Get consistent value of the next-to-be-used sequence number. */
> + next_seq = console_seq;
>
> + console_locked = 0;
> up_console_sem();
>
> /*
> @@ -2664,7 +2667,7 @@ void console_unlock(void)
> * there's a new owner and the console_unlock() from them will do the
> * flush, no worries.
> */
> - retry = prb_read_valid(prb, console_seq, NULL);
> + retry = prb_read_valid(prb, next_seq, NULL);
> printk_safe_exit_irqrestore(flags);
>
> if (retry && console_trylock())

Maybe it's too late here in my time zone, but what are the consequences
of this race?

`retry` can be falsely set, but console_trylock() does not spin on the
owner, so the context that has just released the lock can grab it again
only if it's unlocked. If that context does re-acquire console_sem
because of the race, console_seq will be valid once it holds the lock;
it will then jump to `retry` and re-validate console_seq with
prb_read_valid(). If the record is valid, it'll print the message; and
should another CPU call printk() in the meantime, that CPU will spin on
the owner and the current console_sem owner will yield to it via the
console_lock_spinning branch.
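
To make the above concrete, here is a toy, single-threaded model of the
tail of console_unlock(). It is only a sketch: every toy_* name is made
up for illustration, the "racer" hook stands in for whatever another
context might do in the window after up_console_sem(), and the rewind of
console_seq to 0 by the new owner is an assumed example of a
modification, not something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel state; none of these exist in printk.c. */
struct toy_log {
	uint64_t head_seq;    /* sequence number the next stored record will get */
	uint64_t console_seq; /* next record to print; stable only under the lock */
	bool locked;          /* models ownership of console_sem */
};

/* Hook that runs in the window right after the lock has been dropped. */
typedef void (*toy_racer_fn)(struct toy_log *log);

/* Models prb_read_valid(): is a record with this sequence number stored? */
static bool toy_read_valid(const struct toy_log *log, uint64_t seq)
{
	assert(log != NULL);
	return seq < log->head_seq;
}

/* Models console_trylock(): never spins, fails while somebody owns the lock. */
static bool toy_trylock(struct toy_log *log)
{
	if (log->locked)
		return false;
	log->locked = true;
	return true;
}

/*
 * Assumed racer: a new owner takes the lock and moves console_seq back to 0
 * while it still holds the lock.
 */
static void toy_racer_rewinds(struct toy_log *log)
{
	if (toy_trylock(log))
		log->console_seq = 0;
}

/*
 * Models the end of console_unlock(). With use_snapshot == true the re-check
 * uses the value read under the lock (patched behaviour); otherwise it reads
 * console_seq after the lock was dropped (old behaviour).
 * Returns the value that `retry` would get.
 */
static bool toy_unlock_and_recheck(struct toy_log *log, bool use_snapshot,
				   toy_racer_fn racer)
{
	uint64_t next_seq = log->console_seq; /* read while still the owner */

	log->locked = false;                  /* up_console_sem() */
	if (racer)
		racer(log);                   /* another context runs here */

	return toy_read_valid(log, use_snapshot ? next_seq : log->console_seq);
}
```

In this model, with head_seq == console_seq == 5 and the rewinding racer,
the unlocked read yields retry == true even though nothing new was
stored, and the following toy_trylock() fails because the racer owns the
lock, so the false positive is harmless. With the snapshot, retry is
false for the same interleaving.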

One way or the other, good catch and nice to have it fixed.

Acked-by: Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx>