Re: [PATCH printk v2 04/18] printk: nbcon: Introduce printing kthreads

From: Petr Mladek
Date: Wed Jun 12 2024 - 05:25:15 EST


On Wed 2024-06-12 10:57:11, John Ogness wrote:
> On 2024-06-11, Petr Mladek <pmladek@xxxxxxxx> wrote:
> >> --- a/kernel/printk/printk.c
> >> +++ b/kernel/printk/printk.c
> >> The thread failing to start is a serious issue. Particularly for
> >> PREEMPT_RT.
> >
> > I agree.
> >
> >> Probably it should be something like:
> >>
> >> if (WARN_ON(IS_ERR(kt))) {
> >
> > Might make sense.
>
> I will add this for v2.
>
> > Honestly, if the system is not able to start the kthread then
> > it is probably useless anyway. I would prefer if printk keeps working
> > so that people know what is going on ;-)
>
> OK. For v2 I will change it to fallback to the legacy printing for those
> consoles that do not have a kthread.
>
> > After all, I would add two comments, like these:
> >
> > <proposal-2>
> > /*
> > * Any access to the console device is serialized either by
> > * device_lock() or console context or both.
> > */
> > kt = kthread_run(nbcon_kthread_func, con, "pr/%s%d", con->name,
> > con->index);
> > [...]
> >
> > /*
> > * Some users check con->kthread to decide whether to flush
> > * the messages directly using con->write_atomic(). But they
> > * do so only when the console is already in @console_list.
> > */
>
> I do not understand how @console_list is related to racing between
> non-thread and thread. kthreads are not only created during
> registration. For example, they can be created much later when the last
> boot console unregisters.

I had in mind two particular code paths:

1. The check of con->kthread in nbcon_device_release() before
calling __nbcon_atomic_flush_pending_con().

But it is called only when __uart_port_using_nbcon() returns true.
And that would fail while nbcon_kthread_create() is being called
because the check

hlist_unhashed_lockless(&up->cons->node)

would fail. It checks whether the console is in @console_list.


2. The following check in console_flush_all()

if ((flags & CON_NBCON) && con->kthread)
continue;

The result affects whether the legacy flush would call
nbcon_legacy_emit_next_record().

But this is called only inside for_each_console_srcu(con)
=> it could not race with nbcon_kthread_create()
because this console is not in @console_list at this moment.

In other words, I was curious whether some other code paths might
call con->write_atomic() while the kthread is already running.

It is not that important because it would be safe anyway.
I was checking this before I realized that it would be safe.

Anyway, the information that the console is not in @console_list
when we set con->kthread still looks useful. At minimum,
the check would be racy if the console was on the list.
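
To illustrate the ordering argument, here is a minimal userspace model,
not kernel code. It only covers the case discussed above, where
con->kthread is set while the console is still off the list. The
struct model_console type, the model_*() helpers, the on_list flag, and
the CON_NBCON value are all hypothetical names made up for this sketch.

```c
/* Userspace model only; all names and values here are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CON_NBCON 0x1 /* placeholder value for this model */

struct model_console {
	unsigned int flags;
	void *kthread;              /* non-NULL once a printer thread exists */
	bool on_list;               /* stands in for "is in @console_list" */
	struct model_console *next; /* simplified list linkage */
};

static struct model_console *model_console_list;

/* Mirrors the quoted console_flush_all() check. */
static bool model_legacy_should_skip(const struct model_console *con)
{
	return (con->flags & CON_NBCON) && con->kthread;
}

/* Set ->kthread while the console is off the list, then publish it. */
static void model_register(struct model_console *con, void *kt)
{
	assert(!con->on_list); /* list walkers cannot see @con yet */
	con->kthread = kt;

	con->next = model_console_list;
	model_console_list = con;
	con->on_list = true;
}

/* Count the consoles that a legacy flush would still handle. */
static int model_count_legacy(void)
{
	int n = 0;

	for (const struct model_console *c = model_console_list; c; c = c->next) {
		if (!model_legacy_should_skip(c))
			n++;
	}
	return n;
}
```

In this model, a list walker either does not see the console at all or
sees it with ->kthread already set. It says nothing about kthreads that
are created later, for example when the last boot console unregisters.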

Does it make any sense now?

> I am OK with the first comment of this proposal. I do not understand the
> second comment.

Feel free to propose another comment. Or you could ignore the proposal
if you think that it does more harm than good.

Best Regards,
Petr