Re: [RFC PATCH 1/2] x86/relocs: Improve diagnostic for rejected absolute references

From: Ingo Molnar
Date: Sat Feb 22 2025 - 07:01:32 EST



* Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:

> On Mon, 3 Feb 2025 at 10:40, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> >
> >
> > * Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >
> > > On Mon, 27 Jan 2025 at 17:57, Linus Torvalds
> > > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Mon, 27 Jan 2025 at 03:43, Ard Biesheuvel <ardb+git@xxxxxxxxxx> wrote:
> > > > >
> > > > > Absolute reference to symbol '.rodata+0x180' detected in .head.text (0xffffffff820cb4ba).
> > > >
> > > > Do we have any symbol name lookup logic anywhere?
> > > >
> > >
> > > I can look into that. In this particular case, though, there is no
> > > symbol to look up as it is an anonymous jump table generated by the
> > > compiler. And the function name would be inaccurate too, as
> > > snp_cpuid_postprocess() got inlined into its caller. But I guess with
> > > the right DWARF data, at least the call site could be narrowed down a
> > > bit better.
> >
> > So patch #2 is now upstream, but should I apply this diagnostic patch
> > as-is, or will there be a -v2?
> >
>
> I'm looking into this. But given the points above, I'm reaching the
> conclusion that producing a better diagnostic based solely on vmlinux
> (which may be built without debug info) is intractable, and not even
> the DWARF metadata will describe a compiler-generated jump table using
> a named ELF symbol.
>
> So I am also looking into isolating the startup code like I did for
> arm64 (and which has been adopted by RISC-V as well), but this is
> rather hairy on x86 so it will take some time. But once that lands,
> this diagnostic can be removed.
>
> So I will leave it up to you to decide whether to merge this
> improvement for now, or revert the diagnostic as you suggested before.
> This code has already identified some issues that were subsequently
> fixed, so it has already served its purpose.

So after another 2 weeks there have been no new upstream regressions
that I'm aware of, so - knock on wood - it seems we can leave the die()
in place?

But could we perhaps make it more debuggable, should it trigger - such
as not removing the relevant object file and improving the message?
I.e. make the build failure experience Linus had somewhat more
palatable...
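
For the "improving the message" part, a rough and untested sketch of the
kind of lookup I have in mind: resolve the offending address to the
nearest preceding symbol plus an offset. Everything here is hypothetical
(struct sym, nearest_sym() and format_location() are made-up names, not
what arch/x86/tools/relocs.c has today), and it assumes the tool already
holds a symbol table sorted by ascending address:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical symbol record; not the layout used by the relocs tool. */
struct sym {
	uint64_t addr;
	const char *name;
};

/*
 * Return the symbol with the highest address <= addr, or NULL if addr
 * lies below every symbol. 'syms' must be sorted by ascending address.
 */
static const struct sym *nearest_sym(const struct sym *syms, size_t n,
				     uint64_t addr)
{
	size_t lo = 0, hi = n;

	assert(syms != NULL || n == 0);

	/* Binary search for the first entry whose address is > addr. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (syms[mid].addr <= addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo ? &syms[lo - 1] : NULL;
}

/* Format addr as "name+0xoff", falling back to the raw address. */
static int format_location(char *buf, size_t len, const struct sym *syms,
			   size_t n, uint64_t addr)
{
	const struct sym *s = nearest_sym(syms, n, addr);

	if (!s)
		return snprintf(buf, len, "0x%llx",
				(unsigned long long)addr);
	return snprintf(buf, len, "%s+0x%llx", s->name,
			(unsigned long long)(addr - s->addr));
}
```

As Ard notes above, this cannot name an anonymous jump table, and with
inlining the enclosing function name may be misleading, so at best it
narrows down where in .head.text the reference sits.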

Thanks,

Ingo