Re: [PATCH v6 03/20] modpost: detect section mismatch for R_ARM_MOVW_ABS_NC and R_ARM_MOVT_ABS

From: Ard Biesheuvel
Date: Mon May 22 2023 - 17:50:37 EST


On Mon, 22 May 2023 at 20:03, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
>
> + linux-arm-kernel
>
> On Sun, May 21, 2023 at 9:05 AM Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
> >
> > ARM defconfig misses to detect some section mismatches.
> >
> > [test code]
> >
> > #include <linux/init.h>
> >
> > int __initdata foo;
> > int get_foo(int x) { return foo; }
> >
> > It is apparently a bad reference, but modpost does not report anything
> > for ARM defconfig (i.e. multi_v7_defconfig).
> >
> > The test code above produces the following relocations.
> >
> > Relocation section '.rel.text' at offset 0x200 contains 2 entries:
> > Offset Info Type Sym.Value Sym. Name
> > 00000000 0000062b R_ARM_MOVW_ABS_NC 00000000 .LANCHOR0
> > 00000004 0000062c R_ARM_MOVT_ABS 00000000 .LANCHOR0
> >
> > Relocation section '.rel.ARM.exidx' at offset 0x210 contains 2 entries:
> > Offset Info Type Sym.Value Sym. Name
> > 00000000 0000022a R_ARM_PREL31 00000000 .text
> > 00000000 00001000 R_ARM_NONE 00000000 __aeabi_unwind_cpp_pr0
> >
> > Currently, R_ARM_MOVW_ABS_NC and R_ARM_MOVT_ABS are just skipped.
> >
> > Add code to handle them. I checked arch/arm/kernel/module.c to learn
> > how the offset is encoded in the instruction.
> >
> > The referenced symbol in relocation might be a local anchor.
> > If is_valid_name() returns false, let's search for a better symbol name.
> >
> > Signed-off-by: Masahiro Yamada <masahiroy@xxxxxxxxxx>
> > ---
> >
> > scripts/mod/modpost.c | 12 ++++++++++--
> > 1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
> > index 34fbbd85bfde..ed2301e951a9 100644
> > --- a/scripts/mod/modpost.c
> > +++ b/scripts/mod/modpost.c
> > @@ -1108,7 +1108,7 @@ static inline int is_valid_name(struct elf_info *elf, Elf_Sym *sym)
> > /**
> > * Find symbol based on relocation record info.
> > * In some cases the symbol supplied is a valid symbol so
> > - * return refsym. If st_name != 0 we assume this is a valid symbol.
> > + * return refsym. If is_valid_name() == true, we assume this is a valid symbol.
> > * In other cases the symbol needs to be looked up in the symbol table
> > * based on section and address.
> > * **/
> > @@ -1121,7 +1121,7 @@ static Elf_Sym *find_tosym(struct elf_info *elf, Elf64_Sword addr,
> > Elf64_Sword d;
> > unsigned int relsym_secindex;
> >
> > - if (relsym->st_name != 0)
> > + if (is_valid_name(elf, relsym))
> > return relsym;
> >
> > /*
> > @@ -1312,11 +1312,19 @@ static int addend_arm_rel(struct elf_info *elf, Elf_Shdr *sechdr, Elf_Rela *r)
> > unsigned int r_typ = ELF_R_TYPE(r->r_info);
> > Elf_Sym *sym = elf->symtab_start + ELF_R_SYM(r->r_info);
> > unsigned int inst = TO_NATIVE(*reloc_location(elf, sechdr, r));
> > + int offset;
> >
> > switch (r_typ) {
> > case R_ARM_ABS32:
> > r->r_addend = inst + sym->st_value;
> > break;
> > + case R_ARM_MOVW_ABS_NC:
> > + case R_ARM_MOVT_ABS:
> > + offset = ((inst & 0xf0000) >> 4) | (inst & 0xfff);
> > + offset = (offset ^ 0x8000) - 0x8000;
>
> The code in arch/arm/kernel/module.c then right shifts the offset by
> 16 for R_ARM_MOVT_ABS. Is that necessary?
>

MOVW/MOVT pairs are limited to an addend of -/+ 32 KiB, and the same
value must be encoded in both instructions.

When constructing the actual immediate value from the symbol value and
the addend, only the top 16 bits are used in MOVT and the bottom 16
bits in MOVW.

However, this code seems to borrow the Elf_Rela::addend field (which
ARM does not use natively) to record the intermediate value, which
would need to be split if it is used to fix up instruction opcodes.

Btw the Thumb2 encodings of MOVT and MOVW seem to be missing here.


> > + offset += sym->st_value;
> > + r->r_addend = offset;
> > + break;
> > case R_ARM_PC24:
> > case R_ARM_CALL:
> > case R_ARM_JUMP24:
> > --
> > 2.39.2
> >
>
>
> --
> Thanks,
> ~Nick Desaulniers