Re: [PATCH v6 3/4] printk: Userspace format indexing support

From: Petr Mladek
Date: Fri Jun 04 2021 - 06:19:16 EST


On Tue 2021-06-01 16:15:50, Chris Down wrote:
> Hey Petr,
>
> Thanks for the feedback -- apologies for the delay in getting back.

No problem at all.

> Petr Mladek writes:
> > Note that we do not store the value defined by pr_fmt() which
> > is yet another automatically added prefix.
>
> Hmm, we do actually store pr_fmt, since it's known at compile time (or we
> wouldn't store it at all if it was dynamic due to the __builtin_constant_p
> check). Or are you seeing something different?

I see. Well, the prefix defined by pr_fmt() becomes part of entry.fmt.
It is _not_ stored in entry.pre_fmt which confused me.

As I mentioned in the previous mail. I think about using "subsys_fmt"
or "subsys_prefix" instead of "pre_fmt".

Anyway, we should improve the comment. For example, something like:

/*
* The format string used by various subsystem specific printk()
* wrappers to prefix the message.
*
* Note that the static prefix defined by pr_fmt() macro is handled
* another way. It is stored directly in the message format (@fmt).
*/
const char *subsys_fmt;


> > I would personally omit these prefixes. The most important
>
> Having this information means that system monitors know that they can anchor
> the matching at both ends when linking a printk format with some output,
> rather than having to look for a substring (in which case things become more
> difficult to determine).
>
> I agree that in this particular case with dev_printk it's not incredibly
> useful, but there are certainly others we may want to have in future (like
> the btrfs case) where it would certainly help disambiguate. :-)

Fair enough.

> > > +} __packed;
> > > +
> > > +#define __printk_index_emit(_fmt, _level, _pre_fmt, _post_fmt) \
> > > + ({ \
> > > + if (__builtin_constant_p(_fmt) && __builtin_constant_p(_level)) { \
> > > + /*
> > > + * The compiler may not be able to eliminate the
> > > + * non-constant variants of _fmt and _level, so we need
> > > + * to make sure that it doesn't see any hypothetical
> > > + * assignment for non-constants even though this is
> > > + * already inside the __builtin_constant_p guard.
> > > + */ \
> > > + static const struct pi_entry _entry \
> > > + __used = { \
> > > + .fmt = __builtin_constant_p(_fmt) ? (_fmt) : NULL, \
> > > + .func = __func__, \
> > > + .file = __FILE__, \
> > > + .line = __LINE__, \
> > > + .level = __builtin_constant_p(_level) ? (_level) : NULL, \
> > > + .pre_fmt = _pre_fmt, \
> >
> > Should this also be?
> >
> > .pre_fmt = __builtin_constant_p(_pre_fmt) ? _pre_fmt : NULL
> >
>
> Hmm, if pre_fmt for a subsystem isn't known at compile time, it's an error
> on the developer's end -- otherwise things can't be stored anyway and
> there's no point.

I see.

> > > + * pre and post must be known at compile time, or compilation will fail (since
> > > + * this is a mistake). If fmt or level is not known at compile time, no index
> > > + * entry will be made (since this can legitimately happen).
> >
> > How is this achieved, please?
> >
> > __printk_index_emit() creates the entry when the following check is true:
> >
> > if (__builtin_constant_p(_fmt) && __builtin_constant_p(_level))
> >
> > It checks neither _pre_fmt nor _post_fmt.
>
> Maybe my phrasing wasn't clear, but compilation would fail in that case
> because `.pre_fmt = _pre_fmt` will be a non-constant static initialiser, no?

I see. It was not obvious to me.


> > > + if (!entry->fmt)
> > > + return 0;
> >
> > Is this just a paranoid check or is it a valid case?
>
> It happens only for the following printk:
>
> arch/x86/mm/fault.c:534 show_fault_oops "BUG: kernel NULL pointer dereference, address: %px\n"

I actually see the above format in the index/vmlinux. So, I added some
debugging info ad the problem seems to be with

[ 94.840349] Err: Printk index entry without format defined
[ 94.841160] arch/x86/mm/fault.c:435, is_errata93

The string is defined as:

static const char errata93_warning[] =
KERN_ERR
"******* Your BIOS seems to not contain a fix for K8 errata #93\n"
"******* Working around it, but it may cause SEGVs or burn power.\n"
"******* Please consider a BIOS update.\n"
"******* Disabling USB legacy in the BIOS may also help.\n";

And it is used:
printk_once(errata93_warning);

Strange. I do not understand what is so special about it. The string
is constant so it should not be the problem handled by the double
__builtin_constant_p.

Anyway, we should keep the check for entry->fmt == NULL to be
on the same side. And it is likely needed also in situations
that are handled by the double __builtin_constant_p.


> > Is is possible to update __printk_index_emit() to do not create
> > entries with fmt = NULL ?
> >
> > We should either remove the above check or add a comment
> > explaining why it is necessary.
>
> The trick is unfortunately necessary for the reason described in the comment
> above the double check:
>
> /*
> * The compiler may not be able to eliminate the
> * non-constant variants of _fmt and _level, so we need
> * to make sure that it doesn't see any hypothetical
> * assignment for non-constants even though this is
> * already inside the __builtin_constant_p guard.
> */
>
> Happy to add another comment if it helps.

I re-read the discussion about this,
namely https://lore.kernel.org/r/3Kf896Zt9O+/Yh@xxxxxxxxxxxxxx

Frankly, the comment above ftrace_vprintk() is easier to understand
for me. I mean:

/*
* The double __builtin_constant_p is because gcc will give us an error
* if we try to allocate the static variable to fmt if it is not a
* constant. Even with the outer if statement.
*/

> > > +
> > > + if (entry->level)
> > > + printk_parse_prefix(entry->level, &level, &flags);
> > > + else
> > > + prefix_len = printk_parse_prefix(entry->fmt, &level, &flags);
> >
> > This is missing:
> >
> > if (level == LOGLEVEL_DEFAULT)
> > level = default_message_loglevel;
> >
> > Without it, it produces lines with loglevel <-1>, for example:
> >
> > <-1> init/do_mounts.c:457 mount_block_root "\n"
> > <-1> init/do_mounts.c:456 mount_block_root " %s"
> > <-1> init/do_mounts.c:454 mount_block_root "No filesystem could mount root, tried: "
>
> Hmm, from the perspective of someone building this printk-checking
> infrastructure, I want -1 here instead of dynamically changing the output to
> default_message_loglevel, because default_message_loglevel may not be the
> same across systems. For example, having "-1" allows people to know these
> will come out at whatever the default message loglevel is, rather than
> necessarily being always at (say) level 4 or whatever it happens to be set
> at on the running system.
>
> Without that, it's also not possible to compare entries across systems or
> across time, because the default loglevel may dynamically change at runtime,
> and it's not possible to disambiguate "will always be KERN_X" from "happens
> to be KERN_X" on this system.
>
> And if someone really does want to know what value it will be right now,
> they can just look at /proc/sys/kernel/printk and change it out as desired.

Fair enough.

Best Regards,
Petr