Re: [PATCH 5/7] printk: Make %pS and friends print module build ID

From: Stephen Boyd
Date: Wed Mar 03 2021 - 20:09:39 EST


Quoting Andy Shevchenko (2021-03-03 08:17:01)
> On Wed, Mar 03, 2021 at 10:00:12AM -0500, Steven Rostedt wrote:
> > On Wed, 3 Mar 2021 11:25:58 +0100
> > Petr Mladek <pmladek@xxxxxxxx> wrote:
> >
> > > Alternative solution would be to minimize the information, for
> > > example, by printing only the modules that appear in the backtrace.
> > > But this might be complicated to implement.
> >
> > It could be a list after the backtrace perhaps, and not part of the
> > "modules linked in"?
> >
> > But then you need a generic way of capturing those modules in the backtrace
> > that works for every architecture.

Right, and doing that is sort of complicated for something that really
shouldn't need to be complicated. We're printing out information about a
crash/hang/bug and that should be fast and not too computationally
intensive so that the stacktrace can be printed before everything starts
falling apart. I'd rather not save things away while processing the
stacktrace and then print more info after the stacktrace. Seems fragile.

>
> > Honestly, I don't even know what a buildid is, and it is totally useless
> > information for myself. What exactly is it used for?
>
> Dunno Stephen's motivation, but build ID is very useful when you do tracing,
> then based on ID the decoders can know what exactly was the layout of the
> binary and list of (exported) functions, etc.
>
> At least that was my (shallow) experience with perf last time I have tried it.
>

I'm starting to feel like nobody read the commit text, or I messed up
somehow and the commit text was confusing? :(

│ This is especially helpful for crash debugging with pstore or crashdump
│ kernels. If we have the build ID for the module in the stacktrace we can
│ request the debug symbols for the module from a remote debuginfod server
│ or parse stacktraces at a later time with decode_stacktrace.sh by
│ downloading the correct symbols based on the build ID. This cuts down on
│ the amount of time and effort needed to find the correct kernel modules
│ for a stacktrace by encoding that information into it.

In some distro (read: non-kernel dev) workflows the vmlinux isn't
shipped on the device and crash handling is done offline or much later.
Using the build ID[1] is a common way to identify the binary that's
running on the device. In conjunction with a debuginfod[2] server you
can download the symbols for a crash automatically if you have the build
ID information.
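
To make the lookup concrete, here is a rough sketch (not code from this
series) of how a tool could turn a build ID into a debuginfod request.
The /buildid/<BUILDID>/debuginfo and /buildid/<BUILDID>/executable paths
come from the debuginfod web API; the helper name and the server URL in
the usage line are made up for illustration.

```python
import re

# Build IDs are printed as lowercase hex; anything else is rejected.
_HEX_BUILD_ID = re.compile(r"[0-9a-f]+")


def debuginfod_url(server: str, build_id: str, artifact: str = "debuginfo") -> str:
    """Return the debuginfod URL for one artifact of the given build ID.

    Hypothetical helper for illustration only. `server` is the base URL of
    a debuginfod server; `artifact` is "debuginfo" or "executable".
    """
    build_id = build_id.strip().lower()
    # A build ID is a byte string, so its hex form has an even length.
    if len(build_id) % 2 or not _HEX_BUILD_ID.fullmatch(build_id):
        raise ValueError(f"not a hex build ID: {build_id!r}")
    if artifact not in ("debuginfo", "executable"):
        raise ValueError(f"unsupported artifact: {artifact!r}")
    return f"{server.rstrip('/')}/buildid/{build_id}/{artifact}"


if __name__ == "__main__":
    # Placeholder server and build ID, not real ones.
    print(debuginfod_url("https://debuginfod.example.org", "00112233aabbccdd"))
```

Fetching that URL with any HTTP client would then return the debuginfo
file, provided the server has indexed that build.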

I can add a patch that updates decode_stacktrace.sh to show how it can
download the correct vmlinux/modules when they aren't provided on the
command line.
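
The first step of such a script would be pulling the module build IDs
out of the trace. Below is a minimal sketch of that step, in Python
rather than shell for brevity. It assumes frames look like
"symbol+0xOFF/0xSIZE [module BUILDID]" with a 40-character SHA-1 build
ID; that layout, the function name, and the sample trace are assumptions
for illustration, not a statement of what the final output format is.

```python
import re

# Assumed frame suffix: "[<module> <40 hex chars>]". This layout is an
# assumption for the sketch, not taken from the patch itself.
_MODULE_FRAME = re.compile(r"\[(?P<module>[\w-]+) (?P<build_id>[0-9a-f]{40})\]")


def modules_in_trace(trace: str) -> dict:
    """Map each module named in a stacktrace to its build ID.

    Hypothetical helper: frames without a "[module buildid]" suffix
    (e.g. built-in symbols) are skipped, and a module that appears in
    several frames is recorded once.
    """
    found = {}
    for match in _MODULE_FRAME.finditer(trace):
        found.setdefault(match.group("module"), match.group("build_id"))
    return found


if __name__ == "__main__":
    fake_id = "0123456789abcdef0123456789abcdef01234567"  # placeholder
    sample = (
        " dump_stack+0x1c/0x40\n"
        f" foo_probe+0x24/0x80 [foo {fake_id}]\n"
        f" foo_init+0x10/0x30 [foo {fake_id}]\n"
    )
    for module, build_id in modules_in_trace(sample).items():
        print(module, build_id)
```

Each (module, build ID) pair is then enough to ask a debuginfod server
for the matching symbols without having the module file at hand.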

If the debug symbols are on some public server then in theory we could
have some robot sitting on the mailing list that looks for stacktraces
and automatically replies with information about the line number/file
and even provides the code snippet for the code that's crashing from
that binary, because it's all stored in the full debuginfo builds.

[1] https://fedoraproject.org/wiki/RolandMcGrath/BuildID
[2] https://sourceware.org/elfutils/Debuginfod.html