Re: [PATCH v6 03/11] printk: console: Implement core per-console loglevel infrastructure

From: Petr Mladek
Date: Fri Nov 08 2024 - 11:10:40 EST


On Mon 2024-10-28 16:45:37, Chris Down wrote:
> Consoles can have vastly different latencies and throughputs. For
> example, writing a message to the serial console can take on the order
> of tens of milliseconds to get the UART to successfully write a message.
> While this might be fine for a single, one-off message, this can cause
> significant application-level stalls in situations where the kernel
> writes large amounts of information to the console.
>
> This means that while you might want to send at least INFO level
> messages to (for example) netconsole, which is relatively fast, you may
> only want to send at least WARN level messages to the serial console.
> Such an implementation would permit debugging using the serial console
> in cases that netconsole doesn't receive messages during particularly
> bad system issues, while still keeping the noise low enough to avoid
> inducing latency in userspace applications. To mitigate this, add such
> an interface, extending the existing console loglevel controls to allow
> each console to have its own loglevel.
>
> One can't just disable the serial console, because one may actually need
> it in situations where the machine is in a bad enough state that nothing
> is received on netconsole. One also can't just bump the loglevel at
> runtime after the issue, because usually the machine is already so
> wedged by this point that it isn't responsive to such requests.

The motivation is described nicely.

> The sysfs and kernel command line interfaces to set the per-console
> loglevel will be added later. For now, simply add the necessary internal
> infrastructure to be used by later patches.

It would be nice to describe the basic logic in the commit message.
And also the handling when the global console_loglevel is modified
via sysrq. Something like:

<proposal>
This commit adds the internal infrastructure to support per-console
log levels, which will be configurable through sysfs and the kernel
command line in future commits.

The global `console_loglevel` is preserved and used as the default log
level for all consoles. Each console can override this global level
with its own specific log level stored in `struct console`. To
override the global level, the per-console log level must be greater
than 0; otherwise, the default value of -1 ensures the global level
is used.

The existing `ignore_loglevel` command line parameter will override
both the global and per-console log levels.
</proposal>

> --- a/drivers/tty/sysrq.c
> +++ b/drivers/tty/sysrq.c
> @@ -101,11 +102,25 @@ __setup("sysrq_always_enabled", sysrq_always_enabled_setup);
> static void sysrq_handle_loglevel(u8 key)
> {
> u8 loglevel = key - '0';
> + int cookie;
> + int warned = 0;
> + struct console *con;
>
> console_loglevel = CONSOLE_LOGLEVEL_DEFAULT;
> pr_info("Loglevel set to %u\n", loglevel);
> console_loglevel = loglevel;
> +
> + cookie = console_srcu_read_lock();
> + for_each_console_srcu(con) {
> + if (!warned && per_console_loglevel_is_set(con)) {
> + warned = 1;
> + pr_warn("Overriding per-console loglevel from sysrq\n");
> + }
> + WRITE_ONCE(con->level, -1);
> + }
> + console_srcu_read_unlock(cookie);
> }
> +

I would prefer to move this into a separate patch. It is similar to
the syslog interface which is handled separately as well.

I would keep this patch only for the basic logic.


> static const struct sysrq_key_op sysrq_loglevel_op = {
> .handler = sysrq_handle_loglevel,
> .help_msg = "loglevel(0-9)",
> diff --git a/include/linux/console.h b/include/linux/console.h
> index eba367bf605d..3ff22bfeef2a 100644
> --- a/include/linux/console.h
> +++ b/include/linux/console.h
> @@ -349,6 +350,7 @@ struct console {
> unsigned long dropped;
> void *data;
> struct hlist_node node;
> + int level;
>
> /* nbcon console specific members */
>
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index eca9bb2ee637..5fbd6b7f1ab4 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -303,6 +304,10 @@ static inline void nbcon_device_release(struct console *con)
> static inline void nbcon_atomic_flush_unsafe(void)
> {
> }
> +static inline bool per_console_loglevel_is_set(const struct console *con)

There are similar functions in the printk API, for example,
is_printk_legacy_deferred(). It would be nice to use a similar
naming scheme. I would call it, something like:

has_per_console_loglevel(con)

> +{
> + return false;
> +}
>
> #endif
>
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1287,9 +1287,62 @@ module_param(ignore_loglevel, bool, S_IRUGO | S_IWUSR);
> MODULE_PARM_DESC(ignore_loglevel,
> "ignore loglevel setting (prints all kernel messages to the console)");
>
> -static bool suppress_message_printing(int level)
> +bool per_console_loglevel_is_set(const struct console *con)
> {
> - return (level >= console_loglevel && !ignore_loglevel);
> + return con && (READ_ONCE(con->level) > 0);

We should document why READ_ONCE() is used.

Also it would be great to allow checking of locking dependencies to
make sure that the struct console could not disappear. Similiar
to console_srcu_read_flags(con).

In the followup patches, the funtion is mostly used
the console_srcu_read_lock(). The exception seems to be the sysfs
interface, e.g. effective_loglevel_source_show().

Hmm, we need to distinguish these two callers either by serapate API
or a parameter.

Both is a bit ugly, so I would prefer to reduce the number of these
ugly APIs by passing @con_loglevel instead of @con.

Anyway, we should create something similar to console_srcu_read_flags().


/**
* console_srcu_read_loglevel - Locklessly read the console specific loglevel
* of a possibly registered console
* @con: struct console pointer of console to read flags from
*
* Locklessly reading @con->loglevel provides a consistent read value because
* there is at most one CPU modifying @con->loglevel and that CPU is using only
* read-modify-write operations to do so.
*
* Requires console_srcu_read_lock to be held, which implies that @con might
* be a registered console. The purpose of holding console_srcu_read_lock is
* to guarantee that the console state is valid (CON_SUSPENDED/CON_ENABLED)
* and that no exit/cleanup routines will run if the console is currently
* undergoing unregistration.
*
* If the caller is holding the console_list_lock or it is _certain_ that
* @con is not and will not become registered, the caller may read
* @con->loglevel directly instead.
*
* Context: Any context.
* Return: The current value of the @con->level field.
*/
static inline int console_srcu_read_loglevel(const struct console *con)
{
WARN_ON_ONCE(!console_srcu_read_lock_is_held());

/* The READ_ONCE() matches the WRITE_ONCE() when @flags are modified. */
return data_race(READ_ONCE(con->level));
}

Then I would suggest to pass @con_loglevel directly in most the other
API. For example, this function would be:

bool is_valid_per_console_loglevel(int con_level)
{
return (con_level > 0);
}

> +}
> +
> +/*
> + * Hierarchy of loglevel authority:
> + *
> + * 1. con->level. The locally set, console-specific loglevel. Optional, only
> + * valid if >0.
> + * 2. console_loglevel. The default global console loglevel, always present.

I would remove these comments. It is obvious from the code.

> + */
> +enum loglevel_source
> +console_effective_loglevel_source(const struct console *con)
> +{
> + if (WARN_ON_ONCE(!con))
> + return LLS_GLOBAL;
> +
> + if (ignore_loglevel)
> + return LLS_IGNORE_LOGLEVEL;
> +
> + if (per_console_loglevel_is_set(con))
> + return LLS_LOCAL;
> +
> + return LLS_GLOBAL;
> +}

The @con parameter is nice. But it makes it hard to add lockdep
checks. So, I suggest to pass the console specific loglevel
as the parameter, like:

enum loglevel_source
console_effective_loglevel_source(int con_level)
{
if (ignore_loglevel)
return LLS_IGNORE_LOGLEVEL;

if (is_valid_per_console_loglevel(con_level))
return LLS_LOCAL;

return LLS_GLOBAL;
}

IMHO, it is not that bad after all.


> +int console_effective_loglevel(const struct console *con)
> +{
> + enum loglevel_source source;
> + int level;
> +
> + source = console_effective_loglevel_source(con);
> +
> + switch (source) {
> + case LLS_IGNORE_LOGLEVEL:
> + level = CONSOLE_LOGLEVEL_MOTORMOUTH;
> + break;
> + case LLS_LOCAL:
> + level = READ_ONCE(con->level);
> + break;
> + case LLS_GLOBAL:
> + level = console_loglevel;
> + break;
> + default:
> + pr_warn("Unhandled console loglevel source: %d", source);
> + level = console_loglevel;
> + break;
> + }
> +
> + return level;
> +}

and similar here:

int console_effective_loglevel(int con_level)
{
enum loglevel_source source;
int level;

source = console_effective_loglevel_source(con_level);

switch (source) {
case LLS_IGNORE_LOGLEVEL:
level = CONSOLE_LOGLEVEL_MOTORMOUTH;
break;
case LLS_LOCAL:
level = con_level;
break;
case LLS_GLOBAL:
level = console_level;
break;
default:
pr_warn("Unhandled console loglevel source: %d", source);
level = console_level;
break;
}

return level;
}


> +
> +static bool suppress_message_printing(int level, struct console *con)
> +{
> + return level >= console_effective_loglevel(con);
> }

and here:

static bool suppress_message_printing(int level, int con_level)
{
return level >= console_effective_loglevel(con_level);
}

>
> #ifdef CONFIG_BOOT_PRINTK_DELAY
> @@ -2122,7 +2175,21 @@ int printk_delay_msec __read_mostly;
>
> static inline void printk_delay(int level)
> {
> - if (suppress_message_printing(level))
> + bool will_emit = false;
> + int cookie;
> + struct console *con;
> +
> + cookie = console_srcu_read_lock();
> +
> + for_each_console_srcu(con) {
> + if (!suppress_message_printing(level, con)) {

This is called under srcu read lock so that we could use the srcu_read
API here:

for_each_console_srcu(con) {
int con_level = console_srcu_read_loglevel(con);

if (!suppress_message_printing(level, con_level)) {
...

> + will_emit = true;
> + break;
> + }
> + }
> + console_srcu_read_unlock(cookie);
> +
> + if (!will_emit)
> return;
>
> boot_delay_msec();
> @@ -2975,7 +3042,7 @@ bool printk_get_next_message(struct printk_message *pmsg, struct console *con,
> pmsg->dropped = r.info->seq - seq;
>
> /* Never suppress when used in devkmsg_read() */
> - if (con && suppress_message_printing(r.info->level))
> + if (con && suppress_message_printing(r.info->level, con))

We could read the con_level at the beginning of the funcion. We
already read there the CON_EXTENDED attribute:

int con_level:

if (con) {
is_extended = console_srcu_read_flags(con) & CON_EXTENDED;
con_level = console_srcu_read_loglevel(con);
} else {
/* Used only by devkmsg_read(). */
is_extended = true;
con_level = LOGLEVEL_DEFAULT;
}

Note that I have used LOGLEVEL_DEFAULT instead of the hardcoded -1.
IMHO, it is better to use a NAME and LOGLEVEL_DEFAULT is defined as -1
and the name more or less fits here.

and then do:

if (con && suppress_message_printing(r.info->level, con_level))



> goto out;
>
> if (is_extended) {
> @@ -3789,6 +3856,9 @@ static int try_enable_preferred_console(struct console *newcon,
> if (newcon->index < 0)
> newcon->index = c->index;
>
> + // TODO: Will be configurable in a later patch

I would remove the comment. It is enough to mention this in the commit message.

> + newcon->level = -1;

newcon->level = LOGLEVEL_DEFAULT;

> +
> if (_braille_register_console(newcon, c))
> return 0;
>


Best Regards,
Petr

PS: I am sorry for suggesting so many changes. I think that you did it as
we agreed in the discussion about v5.

v5 was discussed a long time ago and I have learned some new
tricks with lockdep when working on the printk kthreads.