Re: [GIT PULL] printk for 6.11

From: Linus Torvalds
Date: Thu Jul 25 2024 - 12:49:41 EST


On Thu, 25 Jul 2024 at 05:52, Petr Mladek <pmladek@xxxxxxxx> wrote:
>
> I am afraid that we have to live with some buffering. Otherwise,
> the speed of the system might be limited by the speed of the consoles.
> This might be especially noticeable during boot when a lot of HW
> gets initialized and tons of messages are flushed to a slow serial console.

Oh, I don't mind buffering during *normal* operations. Particularly
not for some slow serial line.

It's literally just oopses and KERN_ERR and similar that I think are special.

And I do think some consoles are more special than others.

"I want to buffer because serial lines are slow" does not mean "we
should always buffer" when realistically not everybody has a serial
line.

> Just for the record. The idea of "buffering in emergency" came up
> in the opposite scenario:
>
> <flood of messages>
>
> CPU 0                         CPU 1
>
> WARN()
>   printk()
>     flush_consoles()
>       # handling long backlog
>
>                               panic()
>                                 printk()
>                                   flush_consoles()
>                                     # successfully took over the lock
>                                     # and continued flushing the backlog
>
> Result: CPU 0 never printed the rest of the WARN()

First off, I do think that's fine. This is - by definition - not a
normal situation, and a panic() is *way* more important than some
WARN.

Yes, they may obviously be related, but at the same time, if you
panic, and particularly if you have reboot-on-panic, you damn well get
what you asked for: other things *will* be hidden by the panic. Tough
luck.

There's a very real reason why I tell people that using "BUG_ON()" is
not ok for things where you can just return an error.

And there's also a very real reason why I think people who do
reboot-on-panic get to keep both pieces. It's self-inflicted damage,
and if they come crying to you, tell them so.

What you should *really* take away from this is

(a) you fundamentally can't handle all situations.

We are - by definition - talking about catastrophic kernel bugs, where
something unexpected went very, very wrong.

You can always make up some case that won't work, and you NEED TO REALIZE THAT.

(b) that means that you have to prioritize what you *DO* handle.

And I'm telling you that what needs to be prioritized is an oops (not a
warning), but particularly the *first* one.

Now, unrelated to that, I'm also claiming that the problem you
actually talk about is at least partially caused *by* the excessive
buffering. The whole "long backlog" should never happen, and never be
considered normal behavior.

So I think we also tend to have a behavioral problem, in that our
default console loglevel is too high. It's high by default, and I
suspect some users do console=verbose, which sets

        console_loglevel = CONSOLE_LOGLEVEL_MOTORMOUTH;

which does exactly what it looks like it would do (or, more commonly,
the debug case that "only" sets it to CONSOLE_LOGLEVEL_DEBUG, which is
effectively the same thing in practice - I don't think we have any
users that actually use log-levels past KERN_DEBUG).

The thing is, if you have CONSOLE_LOGLEVEL_DEBUG set, and a big
machine, your bootup *will* be printing a lot of data.

And it *will* be slow over a serial console, particularly one that
runs at some historical speed because that's how a lot of these silly
things work.

But dammit, we literally have a "buffer messages" mode. It's this:

    static bool suppress_message_printing(int level)
    {
            return (level >= console_loglevel && !ignore_loglevel);
    }

and it was *ALWAYS* that. This is literally why log levels exist: they
say "some messages should not be printed, because it's slow and people
don't want to see it unless they need it".
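
As a rough user-space illustration of that filter (a sketch, not kernel
code): the numeric levels follow the kernel's KERN_* numbering, while the
LVL_* names, the starting loglevel of 7 and the count_printed() helper are
made up for this example.

```c
#include <stdbool.h>

/* Hypothetical names; the values mirror KERN_EMERG (0) .. KERN_DEBUG (7). */
enum { LVL_EMERG = 0, LVL_ERR = 3, LVL_WARNING = 4, LVL_INFO = 6, LVL_DEBUG = 7 };

/* Stand-ins for the kernel globals of the same name (values are assumptions). */
static int console_loglevel = 7;
static bool ignore_loglevel = false;

/* Same test as in the mail: a message is kept off the console when its
 * level is numerically >= console_loglevel, unless filtering is disabled. */
static bool suppress_message_printing(int level)
{
	return (level >= console_loglevel && !ignore_loglevel);
}

/* Hypothetical helper: count how many of n messages would reach the console. */
static int count_printed(const int *levels, int n)
{
	int printed = 0;

	for (int i = 0; i < n; i++)
		if (!suppress_message_printing(levels[i]))
			printed++;
	return printed;
}
```

With the loglevel at 7, the debug-level entries in a mix of ERR, WARNING,
INFO and DEBUG messages never reach the console; dropping it to 4 leaves
only the ERR one. The suppressed messages are not lost, they just stay in
the log buffer.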

So I think in a very real sense, the "solution" may be to make sure
our loglevel is something reasonable, and find the messages that cause
excessive noise, and make sure that they are buffered by being
sufficiently low log levels!

And maybe make sure our log level logic actually works right - I think
it should always have been per-console, but for obvious reasons that
was never the original behavior, and I suspect it was never fixed
because all the loglevel stuff I see from a quick grep is just
global.

A loglevel that makes sense for a fast directly connected console may
not make sense for a server with emulated serial lines and tons of
debug info.
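
A per-console loglevel could look something like the sketch below. This is
purely hypothetical: struct sketch_console, its loglevel field and both
helpers are invented for illustration and do not correspond to existing
kernel structures.

```c
#include <stdbool.h>

/* Hypothetical per-console state: each console carries its own threshold
 * instead of sharing one global console_loglevel. */
struct sketch_console {
	const char *name;
	int loglevel;	/* messages with level >= loglevel are skipped here */
};

/* Would this console print a message of the given level? */
static bool console_wants(const struct sketch_console *con, int level)
{
	return level < con->loglevel;
}

/* Count how many of the n consoles would print a message of this level. */
static int consoles_printing(const struct sketch_console *cons, int n, int level)
{
	int count = 0;

	for (int i = 0; i < n; i++)
		if (console_wants(&cons[i], level))
			count++;
	return count;
}
```

With a fast local console at loglevel 7 and a slow serial one at 4, an
info-level (6) message goes only to the fast console, while an error-level
(3) message still reaches both.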

IOW, I think a lot of the whole "long backlog" issues come from us
doing badly with log levels.

Linus