READ_ONCE: was: Re: [PATCH printk v2 19/26] printk: Avoid console_lock dance if no legacy or boot consoles

From: Petr Mladek
Date: Thu Feb 29 2024 - 11:24:55 EST


Adding Paul to Cc.

On Sun 2024-02-18 20:03:19, John Ogness wrote:
> Currently the console lock is used to attempt legacy-type
> printing even if there are no legacy or boot consoles registered.
> If no such consoles are registered, the console lock does not
> need to be taken.
>
> Add tracking of legacy console registration and use it with
> boot console tracking to avoid unnecessary code paths, i.e.
> do not use the console lock if there are no boot consoles
> and no legacy consoles.
>
> --- a/kernel/printk/internal.h
> +++ b/kernel/printk/internal.h
> @@ -44,6 +44,16 @@ enum printk_info_flags {
> };
>
> extern struct printk_ringbuffer *prb;
> +extern bool have_legacy_console;
> +extern bool have_boot_console;
> +
> +/*
> + * Specifies if the console lock/unlock dance is needed for console
> + * printing. If @have_boot_console is true, the nbcon consoles will
> + * be printed serially along with the legacy consoles because nbcon
> + * consoles cannot print simultaneously with boot consoles.
> + */
> +#define printing_via_unlock (have_legacy_console || have_boot_console)
>
> __printf(4, 0)
> int vprintk_store(int facility, int level,
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -463,6 +463,13 @@ static int console_msg_format = MSG_FORMAT_DEFAULT;
> /* syslog_lock protects syslog_* variables and write access to clear_seq. */
> static DEFINE_MUTEX(syslog_lock);
>
> +/*
> + * Specifies if a legacy console is registered. If legacy consoles are
> + * present, it is necessary to perform the console_lock/console_unlock dance
> + * whenever console flushing should occur.
> + */
> +bool have_legacy_console;
> +
> /*
> * Specifies if a boot console is registered. If boot consoles are present,
> * nbcon consoles cannot print simultaneously and must be synchronized by
> @@ -3790,22 +3807,28 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
> seq = prb_next_reserve_seq(prb);
>
> /* Flush the consoles so that records up to @seq are printed. */
> - console_lock();
> - console_unlock();
> + if (printing_via_unlock) {
> + console_lock();
> + console_unlock();
> + }
>
> for (;;) {
> unsigned long begin_jiffies;
> unsigned long slept_jiffies;
>
> + locked = false;
> diff = 0;
>
> - /*
> - * Hold the console_lock to guarantee safe access to
> - * console->seq. Releasing console_lock flushes more
> - * records in case @seq is still not printed on all
> - * usable consoles.
> - */
> - console_lock();
> + if (printing_via_unlock) {
> + /*
> + * Hold the console_lock to guarantee safe access to
> + * console->seq. Releasing console_lock flushes more
> + * records in case @seq is still not printed on all
> + * usable consoles.
> + */
> + console_lock();
> + locked = true;
> + }
>
> cookie = console_srcu_read_lock();
> for_each_console_srcu(c) {
> @@ -3836,7 +3860,8 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
> if (diff != last_diff && reset_on_progress)
> remaining_jiffies = timeout_jiffies;
>
> - console_unlock();
> + if (locked)
> + console_unlock();

Is this actually safe?

What prevents the compiler from optimizing out the "locked" variable
and reading "printing_via_unlock" once again here?

It is not exactly the same but it is similar to "invented loads"
described at
https://lwn.net/Articles/793253/#Invented%20Loads

The writes affecting printing_via_unlock are not synchronized
by console_lock().
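
To make the concern concrete, here is a rough userspace sketch of the
pattern (not kernel code). READ_ONCE_SKETCH(), sketch_lock(),
sketch_unlock() and flush_once() are hypothetical names made up for this
illustration, and the volatile cast is only an approximation of what the
kernel's READ_ONCE() does. The flags are sampled exactly once into
"locked", a writer then changes a flag between the lock and the unlock,
and the unlock decision still follows the snapshot. It is single-threaded
and cannot show what a compiler would really do with plain loads:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: a volatile access stands in for the kernel's READ_ONCE(). */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Same names as in the patch, but these are plain userspace globals. */
static bool have_legacy_console;
static bool have_boot_console;

/* Hypothetical stand-in for console_lock(): only counts nesting. */
static int lock_depth;

static void sketch_lock(void)
{
	lock_depth++;
}

static void sketch_unlock(void)
{
	/* An unlock without a matching lock would be a bug. */
	assert(lock_depth > 0);
	lock_depth--;
}

/*
 * Sample both flags once, then base the lock and the unlock on the
 * snapshot. @racing_writer simulates a registration or unregistration
 * that changes a flag while the "lock" may be held. Returns the lock
 * depth afterwards; 0 means lock and unlock were balanced.
 */
static int flush_once(void (*racing_writer)(void))
{
	bool locked = READ_ONCE_SKETCH(have_legacy_console) ||
		      READ_ONCE_SKETCH(have_boot_console);

	if (locked)
		sketch_lock();

	if (racing_writer)
		racing_writer();

	/* Decided by the snapshot, not by a second read of the flags. */
	if (locked)
		sketch_unlock();

	return lock_depth;
}

/* Hypothetical writers that flip the flag without taking any lock. */
static void register_legacy(void)
{
	have_legacy_console = true;
}

static void unregister_legacy(void)
{
	have_legacy_console = false;
}
```

If the second check re-read the flags instead of using "locked", then
flush_once(register_legacy) would unlock without having locked, and
flush_once(unregister_legacy) would leave the lock held.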

Should we do the following?

/*
 * Specifies if the console lock/unlock dance is needed for console
 * printing. If @have_boot_console is true, the nbcon consoles will
 * be printed serially along with the legacy consoles because nbcon
 * consoles cannot print simultaneously with boot consoles.
 *
 * Prevent compiler speculations when checking the values.
 */
#define printing_via_unlock (READ_ONCE(have_legacy_console) || \
			     READ_ONCE(have_boot_console))


or

	if (printing_via_unlock) {
		[...]
		WRITE_ONCE(locked, true);
	}

Or am I too paranoid?

Best Regards,
Petr