Re: [PATCH 2/4] x86/boot: Remove runtime relocations from .head.text code

From: Arvind Sankar
Date: Sun May 24 2020 - 19:44:08 EST


On Sun, May 24, 2020 at 03:53:59PM -0700, Fangrui Song wrote:
> On 2020-05-24, Arvind Sankar wrote:
> >The assembly code in head_{32,64}.S, while meant to be
> >position-independent, generates run-time relocations because it uses
> >instructions such as
> > leal gdt(%edx), %eax
> >which make the assembler and linker think that the code is using %edx as
> >an index into gdt, and hence gdt needs to be relocated to its run-time
> >address.
> >
> >With the BFD linker, this generates a warning during the build:
> > LD arch/x86/boot/compressed/vmlinux
> >ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
> >ld: warning: creating a DT_TEXTREL in object
>
> Interesting. How does the build generate a warning by default?
> Do you use Gentoo Linux which appears to ship a --warn-shared-textrel
> enabled-by-default patch? (https://bugs.gentoo.org/700488)

Ah, yes, I am using Gentoo. I didn't realize that was a distro
modification.

> >+
> >+/*
> >+ * This macro gives the link address of X. It's the same as X, since startup_32
> >+ * has link address 0, but defining it this way tells the assembler/linker that
> >+ * we want the link address, and not the run-time address of X. This prevents
> >+ * the linker from creating a run-time relocation entry for this reference.
> >+ * The macro should be used as a displacement with a base register containing
> >+ * the run-time address of startup_32 [i.e. la(X)(%reg)], or as an
> >+ * immediate [$ la(X)].
> >+ *
> >+ * This macro can only be used from within the .head.text section, since the
> >+ * expression requires startup_32 to be in the same section as the code being
> >+ * assembled.
> >+ */
> >+#define la(X) ((X) - startup_32)
> >+
>
> IIRC, %ebp contains the address of startup_32. la(X) references X
> relative to startup_32. The fixup (in GNU as and clang integrated
> assembler's term) is a constant which is resolved by the assembler.
>
> There is no R_386_32 or R_386_PC32 for the linker to resolve.

This is incorrect (or maybe I'm not understanding you correctly). X is a
symbol whose final location relative to startup_32 is in most cases not
known to the assembler (there are a couple of cases where X is a label
within .head.text which do get completely resolved by the assembler).

For example, taking the instruction loading the gdt address, this is
what we get from the assembler:
    lea la(gdt)(%ebp), %eax
becomes in the object file:
    11: 8d 85 00 00 00 00    lea 0x0(%ebp),%eax
        13: R_X86_64_PC32 .data+0x23
or a cleaner example using a global symbol:
    subl la(image_offset)(%ebp), %ebx
becomes
    41: 2b 9d 00 00 00 00    sub 0x0(%ebp),%ebx
        43: R_X86_64_PC32 image_offset+0x43

So in general you get PC32 relocations, with the addend being set by the
assembler to .-startup_32, modulo the adjustment for where within the
instruction the displacement needs to be. These relocations are resolved
by the static linker to produce constants in the final executable.
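
To make the arithmetic concrete, here is a rough sketch in Python of the
usual PC-relative calculation (S + A - P) applied to the image_offset
example. It is only an illustration, not what any linker literally runs.
The 0x43 offset and addend are taken from the listing; the link address
of image_offset is made up, and the sketch assumes startup_32 sits at
offset 0 of .head.text.

```python
# Sketch of how a static linker resolves a PC32 relocation: S + A - P.
# Only the 0x43 offset and addend come from the objdump listing; every
# other address here is a hypothetical value chosen for illustration.

MASK32 = 0xFFFFFFFF

def resolve_pc32(sym_addr, addend, place_addr):
    """Return the 32-bit value stored at the relocated place: S + A - P."""
    return (sym_addr + addend - place_addr) & MASK32

STARTUP_32 = 0x0        # link address of startup_32 (0, per the macro comment)
IMAGE_OFFSET = 0x5000   # hypothetical link address of image_offset
DISP_OFFSET = 0x43      # offset of the displacement field within .head.text
ADDEND = 0x43           # addend emitted by the assembler (". - startup_32")

# P is the link address of the displacement field being patched.
place = STARTUP_32 + DISP_OFFSET
disp = resolve_pc32(IMAGE_OFFSET, ADDEND, place)

# The place offset and the addend cancel, leaving image_offset - startup_32,
# a link-time constant that needs no run-time relocation.
assert disp == IMAGE_OFFSET - STARTUP_32
```

The addend and the place offset cancel each other, which is why the
final executable ends up with a plain constant in the displacement.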


>
> Not very sure stating that "since startup_32 has link address 0" is very
> appropriate here (probably because I didn't see the term "link address"
> before). If my understanding above is correct, I think you can just
> reword the comment to express that X is referenced relative to
> startup_32, which is stored in %ebp.
>

Yeah, the more standard term is virtual address/VMA, but that sounds
confusing to me with PIE code, since the _actual_ virtual address at
which this code is going to run isn't 0; that's just the address assumed
for linking. I can reword it to avoid referencing "link address", but
then it's not obvious why the macro is named "la" :) I'm open to
suggestions for a better name. I could use offset, but that's a bit
long-winded, or I could just use vma() if nobody else finds it confusing.
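
To illustrate the distinction between the link address and the run-time
address, here is a small Python sketch. The gdt link address and the
load bases are hypothetical values, and the function names exist only
for this illustration; only the shape of la(X) follows the macro in the
patch.

```python
# Link address vs. run-time address, modeled with plain integers.
# GDT_LINK and the load bases are hypothetical values for illustration.

MASK32 = 0xFFFFFFFF

def la(x_link, startup_32_link=0):
    """Mirror of the la(X) macro: X's link address relative to startup_32."""
    return x_link - startup_32_link

def run_time_address(ebp, x_link, startup_32_link=0):
    """Model of la(X)(%reg): base register plus link-time displacement."""
    return (ebp + la(x_link, startup_32_link)) & MASK32

GDT_LINK = 0x4000  # hypothetical link address of gdt

# The same displacement works for any load base, because the base
# register supplies the run-time address of startup_32.
for load_base in (0x0010_0000, 0x0100_0000):
    assert run_time_address(load_base, GDT_LINK) == load_base + GDT_LINK
```

The displacement never changes between the two load bases; only the
value in the base register does.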

Thanks.