Re: [PATCH v6 09/11] printk: Add sysctl interface to set global loglevels

From: Petr Mladek
Date: Thu Nov 14 2024 - 11:26:10 EST


On Mon 2024-10-28 16:45:55, Chris Down wrote:
> Introduce two new sysctl interfaces for configuring global loglevels:
>
> - kernel.console_loglevel: Sets the global console loglevel, determining
> the minimum priority of messages printed to consoles. Messages with a
> loglevel lower than this value will be printed.
> - kernel.default_message_loglevel: Sets the default loglevel for
> messages that do not specify an explicit loglevel.
>
> The kernel.printk sysctl was previously used to set multiple loglevel
> parameters simultaneously, but it was confusing and lacked proper
> validation. By introducing these dedicated sysctl interfaces, we provide
> a clearer and more granular way to configure the loglevels.
>
> Signed-off-by: Chris Down <chris@xxxxxxxxxxxxxx>
> ---
> Documentation/admin-guide/sysctl/kernel.rst | 17 ++++++++-
> kernel/printk/sysctl.c | 42 +++++++++++++++++++++
> 2 files changed, 57 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index f8bc1630eba0..8019779b27f6 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -1044,6 +1044,20 @@ otherwise the 'doze' mode will be used.
>
> ==============================================================
>
> +Some of these settings may be overridden per-console, see
> +Documentation/admin-guide/per-console-loglevel.rst. See ``man 2 syslog`` for
> +more information on the different loglevels.
> +
> +console_loglevel
> +================
> +
> +Messages with a higher priority than this will be printed to consoles.
> +
> +default_message_loglevel
> +========================
> +
> +Messages without an explicit priority will be printed with this priority.
> +
> printk
> ======
>
> @@ -1052,8 +1066,7 @@ The four values in printk denote: ``console_loglevel``,
> ``default_console_loglevel`` respectively.
>
> These values influence printk() behavior when printing or
> -logging error messages. See '``man 2 syslog``' for more info on
> -the different loglevels.
> +logging error messages.
>
> ======================== =====================================
> console_loglevel messages with a higher priority than
> diff --git a/kernel/printk/sysctl.c b/kernel/printk/sysctl.c
> index f5072dc85f7a..3bce8b89dacc 100644
> --- a/kernel/printk/sysctl.c
> +++ b/kernel/printk/sysctl.c
> @@ -11,6 +11,9 @@
>
> static const int ten_thousand = 10000;
>
> +static int min_msg_loglevel = LOGLEVEL_EMERG;
> +static int max_msg_loglevel = LOGLEVEL_DEBUG;
> +
> static int proc_dointvec_minmax_sysadmin(const struct ctl_table *table, int write,
> void *buffer, size_t *lenp, loff_t *ppos)
> {
> @@ -20,6 +23,29 @@ static int proc_dointvec_minmax_sysadmin(const struct ctl_table *table, int writ
> return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> }
>
> +static int printk_console_loglevel(const struct ctl_table *table, int write,
> + void *buffer, size_t *lenp, loff_t *ppos)

I would make it more clear that this function is using the procfs
based API and call it "proc_dointvec_console_loglevel()".

> +{
> + struct ctl_table ltable = *table;
> + int ret, level;
> +
> + if (!write)
> + return proc_dointvec(table, write, buffer, lenp, ppos);
> +
> + ltable.data = &level;

Ah, I have missed that this is a copy and spent quite some time
wondering why it worked ;-) I remember that the same happened
last time I saw this trick.

It would deserve a comment for people like me. Or maybe, rename
the variable from ltable to table_copy.

Or I think about another solution, see below.

> +
> + ret = proc_dointvec(&ltable, write, buffer, lenp, ppos);
> + if (ret)
> + return ret;
> +
> + if (level != -1 && level != clamp_loglevel(level))
> + return -ERANGE;
> +
> + console_loglevel = level;

There is no locking. It seems that the original proc_dointvec code
handle this by using WRITE_ATOMIC(). It prevents compiler
optimizations. In particular, it makes sure that the entire value
will be updated in a single operation (atomically).

> + return 0;
> +}
> +

I have mixed feelings. On one hand, the copy of the table entry looks
like a nice trick. On the other hand, I guess that this is the only
code using such a trick. It might make it more error prone when
some of the API internals change.

It seems that other users handle similar situations by
passing a custom @conv callback to do_proc_dointvec(),
for example, proc_dointvec_minmax(), proc_dointvec_jiffies().

This approach would require exporing do_proc_dointvec()
and do_proc_dointvec_conv(). But there already is a precedent
when do_proc_douintvec() is used in proc_dopipe_max_size().

I have tried to implement it to see how it looks. And I would
prefer to use it. Here is the updated patch: