Re: [PATCH printk v2 1/8] printk: Add non-BKL (nbcon) console basic infrastructure

From: Petr Mladek
Date: Tue Aug 01 2023 - 06:35:29 EST


On Mon 2023-07-31 22:45:00, John Ogness wrote:
> On 2023-07-31, Petr Mladek <pmladek@xxxxxxxx> wrote:
> >>> #define have_serialized_console (have_legacy_console || have_boot_console)
> >>
> >> This macro is not about having a serialized console. The first
> >> sentence in the comment describes it best. It is just to signal if we
> >> need to do the console lock/unlock dance to generate console output.
> >>
> >> Something like "need_bkl_dance" would be a better name, but it
> >> doesn't sound very technical.
> >
> > I see.
> >
> > Question: Will console_lock() serialize the early-boot handling
> > of non-BKL conosles? I mean the direct flush in vprintk_emit().
>
> Initially there will be no nbcon consoles that support CON_BOOT. This
> means that there are no nbcon consoles in "early boot". The only reason
> that nbcon consoles and legacy boot consoles would co-exist (aside from
> the brief moment before boot consoles are unregistered) is if
> keep_bootcon is used.
>
> As long as a boot console is registered, nbcon consoles are also bound
> to console_lock() serialization. We have no choice until we can safely
> link boot consoles to regular consoles.
>
> I think this will be ok for the first release. The 8250 will not become
> less reliable in early boot. And once the boot console is unregistered,
> the 8250 nbcon console will be able to fly.

Makes sense. Thanks a lot for explanation.

> > At lest, the v1 patch set called cons_atomic_flush() in vprintk_emit()
> > without taking outside console_lock().
>
> Yes. But in the v1 patch set, console_is_usable() returns false for
> nbcon consoles if there is a boot console registered. So the
> cons_atomic_flush() in vprintk_emit() would not end up printing
> anything.
>
> As per your v1 feedback, that check will no longer be in
> console_is_usable(), but instead will be further out in higher level
> code.

I see.

> We have 3 scenarios that I would like to easily identify using global
> variable(s).
>
> 1. There are only nbcon consoles. The console lock never needs to be
> taken.
>
> 2. There are nbcon consoles and regular legacy consoles. The console
> lock must be taken to serialize only the regular legacy consoles. There
> are separate code paths (without the console lock) that will take care
> of nbcon atomic printing and nbcon threaded printing.
>
> 3. There are nbcon consoles and boot consoles. The console lock must be
> taken to serialize all consoles.
>
> Perhaps rather than using 2 booleans and a macro, we just use a single
> atomic_t that describes the console serialization mode? The effect is
> the same, but maybe it makes the intention of the code a bit easier to
> understand?
>
> SERMOD_BOOTCON = 0,
> SERMOD_WITH_LEGACY = 1,
> SERMOD_ONLY_NBCON = 2,

IMHO, this is not ideal. Most locations would need to do the console
lock/unlock dance in both '0' and '1' mode. It can be solved by
"sermode <= SERMOD_WITH_LEGACY" but then it would not be clear
what are the modes below '1'.

> Or maybe describe the modes based on their behavior rather than their
> condition:
>
> SERMOD_ONLY_CONSOLE_LOCK = 0,
> SERMOD_ALSO_CONSOLE_LOCK = 1,
> SERMOD_NO_CONSOLE_LOCK = 2,

The might be more practical. But I think that the original variables
were better after all. Well, what about renaming the macro

#define need_legacy_console_flush (have_legacy_console || have_boot_console)

or

#define need_console_lock (have_legacy_console || have_boot_console)

I personally prefer "need_legacy_console_flush". Well, I am not sure
if it would fit all use cases.

> >>> + Update console_flush_all() description. Mention that it flushes
> >>> only serialized consoles
> >>
> >> Agreed. It is only responsible for bkl dance flushing.
> >
> > Will it flush only legacy consoles? Or will it flush also non-BKL
> > consoles during the early boot?
>
> It will also flush nbcon consoles if a boot console is registered.
>
> > I think that it is wrong even after adding the nbcon check. The code
> > looks like this at the end of this patchset:
> >
> > /*
> > * If consoles are not usable, it cannot be expected
> > * that they make forward progress, so only increment
> > * @diff for usable consoles.
> > */
> > if (!console_is_usable(c))
> > continue;
> >
> > if (flags & CON_NBCON)
> > printk_seq = nbcon_seq_read(c);
> > else if (locked)
> > printk_seq = c->seq;
> > else
> > continue;
> >
> > I guess that the "else-continue" path will never happen. But when
> > it happens then pr_flush() would ignore a usable console and it looks
> > wrong.
>
> My reason for keeping the "if locked" was to remind the developer that
> the console lock must be held in order to safely read @console->seq.

I see.

> But you are right that it makes things look awkward. I will just change the
> code to:
>
> if (flags & CON_NBCON)
> printk_seq = nbcon_seq_read(c);
> else
> printk_seq = c->seq;
>
> There is already a comment at the console_lock() explaining why it is
> taken. That is enough.

This looks better to me.

Best Regards,
Petr