preffer_ofload param: was: Re: [PATCH printk v3 11/19] printk: nbcon: Rely on kthreads for normal operation

From: Petr Mladek
Date: Wed Jul 31 2024 - 09:46:29 EST


On Mon 2024-07-22 19:25:31, John Ogness wrote:
> Once the kthread is running and available
> (i.e. @printk_kthreads_running is set), the kthread becomes
> responsible for flushing any pending messages which are added
> in NBCON_PRIO_NORMAL context. Namely the legacy
> console_flush_all() and device_release() no longer flush the
> console. And nbcon_atomic_flush_pending() used by
> nbcon_cpu_emergency_exit() no longer flushes messages added
> after the emergency messages.
>
> --- a/kernel/printk/internal.h
> +++ b/kernel/printk/internal.h
> @@ -190,11 +190,13 @@ extern bool legacy_allow_panic_sync;
> /**
> * struct console_flush_type - Define how to flush the consoles
> * @nbcon_atomic: Flush directly using nbcon_atomic() callback
> + * @nbcon_offload: Offload flush to printer thread
> * @legacy_direct: Call the legacy loop in this context
> * @legacy_offload: Offload the legacy loop into IRQ
> */
> struct console_flush_type {
> bool nbcon_atomic;
> + bool nbcon_offload;
> bool legacy_direct;
> bool legacy_offload;
> };
> @@ -220,7 +222,9 @@ static inline void printk_get_console_flush_type(struct console_flush_type *ft,
> ft->legacy_direct = true;
> }
>
> - if (have_nbcon_console && !have_boot_console)
> + if (printk_kthreads_running)
> + ft->nbcon_offload = true;
> + else if (have_nbcon_console && !have_boot_console)
> ft->nbcon_atomic = true;
> break;
>
> diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
> index 233ab8f90fef..8cf9e9e8c6e4 100644
> --- a/kernel/printk/nbcon.c
> +++ b/kernel/printk/nbcon.c
> @@ -1511,10 +1511,10 @@ static void nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq,
>
> /*
> * If flushing was successful but more records are available, this
> - * context must flush those remaining records because there is no
> - * other context that will do it.
> + * context must flush those remaining records if the printer thread
> + * is not available do it.
> */
> - printk_get_console_flush_type(&ft, false);
> + printk_get_console_flush_type(&ft, true);

Hmm, it is a bit weird that we change the value even though it does
not affect the behavior. The parameter @prefer_offload affects only
the legacy loop.

It is even more confusing because ft.legacy_offload and
ft.nbcon_offload have difference motivation:

+ legacy_offload is used when the direct flush is not allowed/possible
+ nbcon_offload is used when allowed/possible

I am not 100% sure why I added the @prefer_offload parameter. I think
that it was for situations when we wanted to call the legacy loop
independently on whether direct or offload was chosen. I think
that it was for __wake_up_klogd() called from
defer_console_output().

Anyway, this is cscope output in emacs after applying this patchset:

<paste>
Finding symbol: printk_get_console_flush_type

Database directory: /prace/kernel/linux-printk/
-------------------------------------------------------------------------------
*** kernel/printk/internal.h:
printk_get_console_flush_type[225] static inline void printk_get_console_flush_type(struct console_flush_type *ft, bool prefer_offload)

*** kernel/printk/nbcon.c:
nbcon_atomic_flush_pending_con[1548] printk_get_console_flush_type(&ft, true);
nbcon_device_release[1848] printk_get_console_flush_type(&ft, true);

*** kernel/printk/printk.c:
printk_legacy_allow_panic_sync[2343] printk_get_console_flush_type(&ft, false);
vprintk_emit[2381] printk_get_console_flush_type(&ft, false);
resume_console[2706] printk_get_console_flush_type(&ft, true);
console_cpu_notify[2729] printk_get_console_flush_type(&ft, false);
console_flush_all[3102] printk_get_console_flush_type(&ft, true);
console_unlock[3223] printk_get_console_flush_type(&ft, false);
console_flush_on_panic[3375] printk_get_console_flush_type(&ft, false);
console_start[3453] printk_get_console_flush_type(&ft, true);
legacy_kthread_should_wakeup[3484] printk_get_console_flush_type(&ft, true);
__pr_flush[4290] printk_get_console_flush_type(&ft, false);
__pr_flush[4302] printk_get_console_flush_type(&ft, false);
__wake_up_klogd[4466] printk_get_console_flush_type(&ft, true);
console_try_replay_all[4851] printk_get_console_flush_type(&ft, true);
-------------------------------------------------------------------------------
</paste>

Now, the parameter @prefer_offload makes a difference only
when it is set to "true" and "ft.legacy_offload" value is
later used. It reduces the above list to:

*** kernel/printk/printk.c:
resume_console[2706] printk_get_console_flush_type(&ft, true);
console_start[3453] printk_get_console_flush_type(&ft, true);
__wake_up_klogd[4466] printk_get_console_flush_type(&ft, true);
console_try_replay_all[4851] printk_get_console_flush_type(&ft, true);

IMHO, __wake_up_klogd() is the only location where we really need
to know if there are any messages for the legacy loop, for example,
when called from printk_deferred().

It should not be needed in other situations because it there
is always __pr_flush() or console_unlock() which would flush
the legacy consoles directly anyway.

=> I suggest to

1. Remove @prefer_offload parameter from printk_get_console_flush_type

2. Update __wake_up_klogd() to check both ft.legacy_offload and
ft.legacy_direct, like:

printk_get_console_flush_type(&ft);
if (!ft.legacy_offload && !ft.legacy_direct)
val &= ~PRINTK_PENDING_OUTPUT;


NOTE: I actually suggested to use in vprintk_emit():

printk_get_console_flush_type(&ft, deffer_legacy);

But it can be done even without this parameter. Like it
is done in this version of the patchset.

Best Regards,
Petr