Re: [PATCH 2/5] selftests/sgx: Fix function pointer relocation in test enclave.
From: Huang, Kai
Date: Fri Aug 18 2023 - 08:55:21 EST
On Mon, 2023-08-07 at 09:13 +0200, Jo Van Bulck wrote:
> On 03.08.23 05:58, Huang, Kai wrote:
> > Putting aside whether we should consider building the selftests using "-Os", it
> > would be helpful to explain how can the "-Os" break the existing code, so that
> > we can review why this fix is reasonable. Perhaps it's obvious to others but
> > it's not obvious to me what can go wrong here.
>
> I dug deeper into this and the real problem here is that the enclave ELF
> is linked with -static but also needs to be relocatable, which, in my
> understanding, is not what -static is meant for (i.e., the man pages
> says -static overrides -pie). Particularly, with -static I noticed that
> global variables are hard-coded assuming the ELF is loaded at base
> address zero.
>
> When reading more on this, it seems like the proper thing to do for
> static and relocatable binaries is to compile with -static-pie, which is
> added since gcc 8 [1] (and similarly supported by clang).
Thanks for analysing!
>
> As a case in point, to hopefully make this clearer, consider the
> following C function:
>
> extern uint8_t __enclave_base;
> void *get_base(void) {
> return &__enclave_base;
> }
>
> Compiling with -static and -fPIC hard codes the __enclave_base symbol to
> zero (the start of the ELF enclave image):
>
> 00000000000023f4 <get_base>:
> 23f4: f3 0f 1e fa endbr64
> 23f8: 55 push %rbp
> 23f9: 48 89 e5 mov %rsp,%rbp
> 23fc: 48 c7 c0 00 00 00 00 mov $0x0,%rax
> 2403: 5d pop %rbp
> 2404: c3 ret
>
> Compiling with -static-pie and -fPIE properly emits a RIP-relative address:
>
> 00000000000023f4 <get_base>:
> 23f4: f3 0f 1e fa endbr64
> 23f8: 55 push %rbp
> 23f9: 48 89 e5 mov %rsp,%rbp
> 23fc: 48 8d 05 fd db ff ff lea -0x2403(%rip),%rax # 0
> <__enclave_base>
> 2403: 5d pop %rbp
> 2404: c3 ret
>
> Now, the fact that it currently *happens* to work is a mere coincidence
> of how the local encl_op_array initialization is compiled without
> optimizations with -static -fPIC:
>
> 00000000000023f4 <encl_body>:
> /* snipped */
> 2408: 48 8d 05 ec fe ff ff lea -0x114(%rip),%rax
> # 22fb <do_encl_op_put_to_buf>
> 240f: 48 89 45 b0 mov %rax,-0x50(%rbp)
> 2413: 48 8d 05 18 ff ff ff lea -0xe8(%rip),%rax
> # 2332 <do_encl_op_get_from_buf>
> 241a: 48 89 45 b8 mov %rax,-0x48(%rbp)
> 241e: 48 8d 05 44 ff ff ff lea -0xbc(%rip),%rax
> # 2369 <do_encl_op_put_to_addr>
> /* snipped */
>
> When compiling with optimizations with -static -fPIC -Os, encl_op_array
> is instead initialized with a prepared copy from .data:
>
> 00000000000021b5 <encl_body>:
> /* snipped */
> 21bc: 48 8d 35 3d 2e 00 00 lea 0x2e3d(%rip),%rsi
> # 5000 <encl_buffer+0x2000>
> 21c3: 48 8d 7c 24 b8 lea -0x48(%rsp),%rdi
> 21c8: b9 10 00 00 00 mov $0x10,%ecx
> 21cd: f3 a5 rep movsl %ds:(%rsi),%es:(%rdi)
> /* snipped */
How is the "prepared copy" prepared, exactly? Could you paste the relevant code
here too? IMHO w/o it it's hard to tell whether the code could be wrong or not
after relocating.
>
> Thus, in this optimized code, encl_op_array will have function pointers
> that are *not* relocated. The compilation assumes the -static binary has
> base address zero and is not relocatable:
>
> $ readelf -r test_encl.elf
>
> There are no relocations in this file.
>
> When compiling with -static-pie -PIE -Os, the same code is emitted *but*
> the binary is relocatable:
>
> $ readelf -r test_encl.elf
>
> Relocation section '.rela.dyn' at offset 0x4000 contains 12 entries:
> Offset Info Type Sym. Value Sym. Name
> + Addend
> # tcs1.ossa
> 000000000010 000000000008 R_X86_64_RELATIVE 7000
> # tcs1.oentry
> 000000000020 000000000008 R_X86_64_RELATIVE 21e3
> # tcs2.ossa
> 000000001010 000000000008 R_X86_64_RELATIVE 8000
> # tcs2.oentry
> 000000001020 000000000008 R_X86_64_RELATIVE 21e3
> # encl_op_array
> 000000006000 000000000008 R_X86_64_RELATIVE 2098
> 000000006008 000000000008 R_X86_64_RELATIVE 20ae
> 000000006010 000000000008 R_X86_64_RELATIVE 20c4
> 000000006018 000000000008 R_X86_64_RELATIVE 20d7
> 000000006020 000000000008 R_X86_64_RELATIVE 20ea
> 000000006028 000000000008 R_X86_64_RELATIVE 203e
> 000000006030 000000000008 R_X86_64_RELATIVE 2000
> 000000006038 000000000008 R_X86_64_RELATIVE 20ef
>
> Apparently, for static-pie binaries, glibc includes a
> _dl_relocate_static_pie routine that will perform the relocations as
> part of the startup [2,3].
>
I am not sure whether all those 'rela.dyn' matters due to the reason you
mentioned below ...
> Since the enclave loading process is
> different and glibc is not included, we cannot rely on these relocations
> to be performed and we need to do them manually.
>
... here.
Those relocation table are not used by enclave builder anyway. Only ".tsc"
".text" and ".data" + some heap are built as enclave.
> Note: the first 4
> symbols in the relocation table above are from the TCS initialization in
> test_encl_bootstrap.S and should *not* be relocated (as these are
> relative to enclave base as per SGX spec).
I don't quite follow here. Per my understanding TCS pages can be any page
within the enclave. I don't quite understand what does "relocated" mean here.
>
> Bottom line: the way I see it, the enclave should either ensure no
> relocations are needed, or perform the relocations manually where
> needed, or include a _dl_relocate_static_pie equivalent that parses the
> .rela.dyn ELF section and patches all relocations (minus the TCS
> symbols). Since the latter (most general) approach is likely going to
> make the selftest enclave unnecessarily complex by including ELF parsing
> etc, I propose to simply relocate the function-pointer table manually
> (which is indeed the only place that needs relocations).
I think we are kinda mixing two things together: 1) the "relocation" supported
by the "non-enclave" normal case, where the compiler/linker generates the PIC
code, and the loader does the "runtime" fixup for those in the "rela.dyn"; 2)
the "relocation" for the enclave case, where the compiler/linker still generates
the PIC code, but the "enclave loader" loads the enclave into a random virtual
address of the process.
Obviously the "enclave loader" (implemented in this selftests/sgx/...) isn't as
powerful as the "real loader" in the normal case. In fact, reading the code,
IIUC, it simply gathers ".tsc"/".text"/".data" sections from the ELF file (and
plus some heap) and load them into the enclave.
Now the important thing is: those sections are "contiguous" in the enclave.
That means the kernel needs to build the enclave ELF file with those sections
"contiguously" in the same order too as a single piece, and this single piece
can work at any random address that the "enclave loader" loads the enclave. Any
address fixing up due to different location of ".data"/".tsc" section at loading
time cannot be generated.
This should be the thing that we need to focus on.
That being said, I think ideally there shouldn't be _ANY_ "rela.dyn" in the
enclave ELF file.
>
> I will include code to properly compile the selftest enclave with
> -static-pie as per above and relocate the function-pointer table in the
> next patch revision.
I agree we should use "-static-pie" + "-fPIE" (or "-fPIC" is also OK??).
However I am yet not convinced the "relocate function-pointer" thing. If you
can paste the relevant code it would be helpful.
Or am I missing something big?