Re: [PATCH] kallsyms: fix nonconverging kallsyms table with lld

From: Guenter Roeck
Date: Wed Jun 09 2021 - 11:16:13 EST


On Wed, Jun 09, 2021 at 01:24:18PM +0200, Arnd Bergmann wrote:
> On Wed, Jun 9, 2021 at 1:05 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> > On Thu, Feb 04, 2021 at 04:29:47PM +0100, Arnd Bergmann wrote:
> > > From: Arnd Bergmann <arnd@xxxxxxxx>
> > >
> > > ARM randconfig builds with lld sometimes show a build failure
> > > from kallsyms:
> > >
> > > Inconsistent kallsyms data
> > > Try make KALLSYMS_EXTRA_PASS=1 as a workaround
> > >
> > > The problem is that the veneers/thunks added by the linker extend
> > > the symbol table, which in turn leads to more veneers being needed,
> > > so it may take a few extra iterations to converge.
> > >
> > > This bug has been fixed multiple times before, but comes back every time
> > > a new symbol name is used. lld uses a different set of identifiers from
> > > ld.bfd, so the additional ones need to be added as well.
> > >
> > > I looked through the sources and found that arm64 and mips define similar
> > > prefixes, so I'm adding those as well, aside from the ones I observed. I'm
> > > not sure about powerpc64, which seems to already be handled through a
> > > section match, but if it comes back, the "__long_branch_" and "__plt_"
> > > prefixes would have to get added as well.
> > >
> >
> > This is such a whack-a-mole. The problem is hitting us yet again. I suspect
> > it may be due to a new version of lld using new symbols, but I didn't really
> > try to track it down. Is there an easy way to search for missed symbols?
>
> The way I did it previously was to hack Kbuild to not remove the temporary
> files after a failure, and then compare the "objdump --syms" output of the
> last two stages.

The problem with that is that the failure is non-deterministic: the build
fails for us on some build servers, but we are unable to reproduce it
when building the same image manually on a development server.
That is similar to what I had observed before, where powerpc builds would
pass on one server, but the same kernel with the same configuration would
fail to build on a second, almost identical server. It would really be great
if we could find a better solution.

>
> I suppose we could improve the situation if scripts/link-vmlinux.sh was able
> to do that automatically, and compare the kallsyms output .S file between
> steps 1 and 2.

Comparing the .S files doesn't produce useful data; it turns out there are
always irrelevant differences. We'll try running a diff on the output of
"objdump --syms" instead. Hopefully that will generate something useful.

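For illustration, here is a minimal sketch in Python of the kind of
comparison meant above. It is not part of any patch. It assumes the
"objdump --syms" output of two consecutive link stages has been saved as
text, and that a symbol line starts with a hex address and ends with the
symbol name. The function names and the sample data are made up.

```python
import re

# A symbol line in "objdump --syms" output starts with a hex address.
# Assumption: 8 to 16 hex digits, covering 32-bit and 64-bit targets.
_ADDR = re.compile(r"^[0-9a-fA-F]{8,16}\s")


def symbol_names(objdump_text):
    """Return the set of symbol names found in 'objdump --syms' text."""
    names = set()
    for line in objdump_text.splitlines():
        if not _ADDR.match(line):
            continue  # header, blank line, or anything else
        fields = line.split()
        if len(fields) < 4:
            continue  # too short to hold address, section, size, name
        names.add(fields[-1])  # assumption: the name is the last field
    return names


def compare_stages(prev_text, next_text):
    """Return (added, removed) symbol names between two link stages."""
    prev = symbol_names(prev_text)
    nxt = symbol_names(next_text)
    return sorted(nxt - prev), sorted(prev - nxt)


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real build.
    stage1 = "SYMBOL TABLE:\nc0008000 g     F .text\t00000034 stext\n"
    stage2 = stage1 + (
        "c0008040 l     F .text\t00000008 __ThumbV7PILongThunk_foo\n"
    )
    added, removed = compare_stages(stage1, stage2)
    print("added:", added)
    print("removed:", removed)
```

Comparing names only, rather than whole lines, ignores address shifts
between stages, so what remains should be the symbols that appear or
disappear, which are the candidates for the ignore list.
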
Thanks,
Guenter