Re: aarch64 binaries using nolibc segfault before reaching the entry point

From: Thomas Weißschuh
Date: Wed Sep 13 2023 - 16:19:05 EST


On 2023-09-13 20:44:59+0200, Sebastian Ott wrote:
> Hi,
>
> the tpidr2 selftest on an arm box segfaults before reaching the entry point.
> I have no clue what is to blame for this or how to debug it but for a
> statically linked binary there shouldn't be much stuff going on besides the
> elf loader?
>
> I can reproduce this with a program using an empty main function. Also checked
> for other nolibc users - same result for init.c from rcutorture.
>
> tools/testing/selftests/arm64/fp/za-fork is working though - the only
> difference I could spot here is that it is linked together with another object
> file. I also looked at the elf headers but didn't find anything obvious (but
> I'm a bit out of my comfort zone here..)
>
> After playing around with linker options I found that using -static-pie
> lets the binaries run successfully.
>
> [root@arm abi]# cat test.c
> int main(void)
> {
> 	return 1;
> }
> [root@arm abi]# gcc -Os -static -Wall -lgcc -nostdlib -ffreestanding -include ../../../../include/nolibc/nolibc.h test.c
> [root@arm abi]# ./a.out
> Segmentation fault
> [root@arm abi]# gcc -Os -static -Wall -lgcc -nostdlib -ffreestanding -static-pie -include ../../../../include/nolibc/nolibc.h test.c
> [root@arm abi]# ./a.out
> [root@arm abi]#
>
> All on aarch64 running fedora37 + upstream kernel. Any hints on what could
> be broken here or how to actually fix it?

I reduced it to the following reproducer:

$ cat test.c
int foo; /* It works when deleting this variable */

void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) _start(void)
{
	__asm__ volatile (
		"mov x8, 93\n" /* NR_exit == 93 */
		"svc #0\n"
	);
	__builtin_unreachable();
}

$ aarch64-linux-gnu-gcc -Os -static -fno-stack-protector -Wall -nostdlib test.c
$ ./a.out
Segmentation fault

Also when running under gdb the error message is:

During startup program terminated with signal SIGSEGV, Segmentation fault.

So the fault seems to happen while the binary is being loaded, before
_start is ever reached.
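
One way to look at what the ELF loader is handed is to dump the PT_LOAD
program headers of a.out, for example with "readelf -lW a.out" or with a
small host-side helper. Below is a rough sketch of such a helper, not
anything from nolibc or the selftests: the function name
count_zero_filesz_loads() is made up, it only handles 64-bit ELF files in
the host's byte order, and the idea that "int foo" lands in .bss and yields
a PT_LOAD segment with p_filesz == 0 is an assumption that would still need
to be checked.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of nolibc or the kernel selftests).
 * Prints every PT_LOAD program header of a 64-bit ELF file and returns
 * the number of PT_LOAD entries with p_filesz == 0 and p_memsz > 0,
 * i.e. segments that are not backed by any file data.
 * Returns -1 on I/O or format errors.
 * Assumption: the file uses the same byte order as the host.
 */
int count_zero_filesz_loads(const char *path)
{
	FILE *f = fopen(path, "rb");
	Elf64_Ehdr eh;
	int found = 0;

	if (!f)
		return -1;

	/* Validate the ELF header before trusting any of its fields. */
	if (fread(&eh, sizeof(eh), 1, f) != 1 ||
	    memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
	    eh.e_ident[EI_CLASS] != ELFCLASS64 ||
	    eh.e_phentsize != sizeof(Elf64_Phdr)) {
		fclose(f);
		return -1;
	}

	for (unsigned int i = 0; i < eh.e_phnum; i++) {
		Elf64_Phdr ph;
		long off = (long)(eh.e_phoff + (Elf64_Off)i * sizeof(ph));

		if (fseek(f, off, SEEK_SET) != 0 ||
		    fread(&ph, sizeof(ph), 1, f) != 1) {
			fclose(f);
			return -1;
		}
		if (ph.p_type != PT_LOAD)
			continue;

		printf("LOAD vaddr=0x%llx filesz=0x%llx memsz=0x%llx flags=%c%c%c\n",
		       (unsigned long long)ph.p_vaddr,
		       (unsigned long long)ph.p_filesz,
		       (unsigned long long)ph.p_memsz,
		       (ph.p_flags & PF_R) ? 'R' : '-',
		       (ph.p_flags & PF_W) ? 'W' : '-',
		       (ph.p_flags & PF_X) ? 'X' : '-');

		if (ph.p_filesz == 0 && ph.p_memsz > 0)
			found++;
	}

	fclose(f);
	return found;
}
```

Calling this from a trivial main() on a.out, once with and once without
"int foo", and comparing the printed segments would show whether the
variable changes the layout the kernel has to map.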

Could this be a compiler or kernel bug?

Thomas