Re: [PATCH v6 02/20] modpost: fix section mismatch message for R_ARM_ABS32

From: Ard Biesheuvel
Date: Tue May 23 2023 - 03:13:28 EST


On Tue, 23 May 2023 at 07:08, Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
>
> On Tue, May 23, 2023 at 6:36 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >
> > On Mon, 22 May 2023 at 19:56, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
> > >
> > > + linux-arm-kernel and some folks who might know another idea.
> > >
> > > On Sun, May 21, 2023 at 9:05 AM Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
> > > >
> > > > addend_arm_rel() processes R_ARM_ABS32 incorrectly.
> > > >
> > > > Here is some simple test code.
> > > >
> > > > [test code 1]
> > > >
> > > > #include <linux/init.h>
> > > >
> > > > int __initdata foo;
> > > > int get_foo(int x) { return foo; }
> > > >
> > > > If you compile it with ARM versatile_defconfig, modpost will show the
> > > > symbol name, (unknown).
> > > >
> > > > WARNING: modpost: vmlinux.o: section mismatch in reference: get_foo (section: .text) -> (unknown) (section: .init.data)
> > > >
> > > > If you compile it for other architectures, modpost will show the correct
> > > > symbol name.
> > > >
> > > > WARNING: modpost: vmlinux.o: section mismatch in reference: get_foo (section: .text) -> foo (section: .init.data)
> > > >
> > > > For R_ARM_ABS32, addend_arm_rel() sets r->r_addend to a wrong value.
> > > >
> > > > I just mimicked the code in arch/arm/kernel/module.c.
> > > >
> > > > However, there is more difficulty for ARM.
> > > >
> > > > Here is another piece of test code.
> > > >
> > > > [test code 2]
> > > >
> > > > #include <linux/init.h>
> > > >
> > > > int __initdata foo;
> > > > int get_foo(int x) { return foo; }
> > > >
> > > > int __initdata bar;
> > > > int get_bar(int x) { return bar; }
> > > >
> > > > With this commit applied, modpost will show the following messages
> > > > for ARM versatile_defconfig:
> > > >
> > > > WARNING: modpost: vmlinux.o: section mismatch in reference: get_foo (section: .text) -> foo (section: .init.data)
> > > > WARNING: modpost: vmlinux.o: section mismatch in reference: get_bar (section: .text) -> foo (section: .init.data)
> > > >
> > > > The reference from 'get_bar' to 'foo' seems wrong.
> > > >
> > > > I have no solution for this because it is true at the assembly level.
> > > >
> > > > In the following output, relocation at 0x1c is no longer associated
> > > > with 'bar'. The two relocation entries point to the same symbol, and
> > > > the offset to 'bar' is encoded in the instruction 'r0, [r3, #4]'.
> > > >
> >
> > These are section relative relocations - this is unusual but not
> > incorrect. Normally, you only see this if the symbols in question have
> > static linkage.
>
>
> I noticed this usually happens in reference to 'static',
> but on ARM, it happens even without 'static'.
> See the [test code 1].
>
>
> > It does mean that the symbol is not preemptible, which is what makes
> > this somewhat surprising.
> >
> > Generally, you cannot resolve a relocation to a symbol without taking
> > the addend into account, so looking up the address of .init.data in
> > the symbol table is not quite the right approach here. If anything,
> > the symbol should be reported as [.init.data+0x4] in the second case.
>
>
> In the old days, section mismatch warnings showed
> only the referenced section name.
>
> Since [1], modpost has shown the referenced symbol name too.
> Modpost has done so for the past 17 years.
> It sometimes shows a wrong name, but works on most architectures.
> Unfortunately, I noticed that ARM is one of the cases where it does not.
>
> Do you suggest removing it entirely?
>

No, not at all. But resolving the symbol should take the addend into
account, and this is essentially what you are doing in your patch.

The point is really that the relocation in question does not refer to
the symbol - it refers to a section+offset that we /think/ corresponds
with a certain symbol. But for example, if the symbol is weak and
another definition exists, the section based relocation will refer to
one version, and a relocation that references the symbol name will
refer to the other version.
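
To make the section+offset point concrete, here is a rough sketch of
such a lookup. It is not the modpost code: struct sym, lookup() and
describe() are names made up for illustration, and the table layout is
a simplified assumption rather than the real Elf_Sym. It reports a
symbol name only when some symbol covers the resolved offset, and
otherwise falls back to the [section+offset] form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified symbol record; not modpost's Elf_Sym. */
struct sym {
	const char *name;
	unsigned int shndx;	/* index of the section holding the symbol */
	uint32_t value;		/* offset of the symbol within that section */
	uint32_t size;		/* size of the object in bytes */
};

/* Return the symbol that covers (shndx, offset), or NULL if none does. */
static const struct sym *lookup(const struct sym *tab, size_t n,
				unsigned int shndx, uint32_t offset)
{
	for (size_t i = 0; i < n; i++) {
		if (tab[i].shndx != shndx)
			continue;
		if (offset >= tab[i].value &&
		    offset - tab[i].value < tab[i].size)
			return &tab[i];
	}
	return NULL;
}

/*
 * Describe the target of a section-relative relocation. 'offset' is the
 * section symbol's value plus the addend; the caller must already have
 * extracted the addend from the relocated location.
 */
static void describe(char *buf, size_t len, const struct sym *tab, size_t n,
		     const char *secname, unsigned int shndx, uint32_t offset)
{
	const struct sym *s = lookup(tab, n, shndx, offset);

	assert(buf && len > 0);
	if (s)
		snprintf(buf, len, "%s", s->name);
	else
		snprintf(buf, len, "[%s+0x%x]", secname, (unsigned int)offset);
}
```

With 'foo' at offset 0 and 'bar' at offset 4 of the same section, each
4 bytes, an offset of 4 is reported as bar and an offset of 8 as
[.init.data+0x8]. Note that this only helps when the full offset is
visible in the relocation; it cannot recover an offset that the
compiler folded into the load instruction. It also matches by address
only, so it says nothing about which definition of a weak symbol a
name-based relocation would pick.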


>
> If (elf->symtab_start + ELF_R_SYM(r.r_info)) has a sensible
> symbol name, print it. Otherwise, print only the section name.
> Is this what you mean?
>
> That means, we will lose the symbol name info of 'static'
> (and even global symbols on ARM)
>
>
> That is what I wrote in the commit description.
>
> "I am keeping the current logic because it is useful in many architectures,
> but the symbol name is not always correct depending on the optimization
> of the relocation. I left some comments in find_tosym()."
>

Fair enough.