Re: [PATCH] nolibc: optimise _start() on x86_64

From: Alexey Dobriyan
Date: Sun Dec 03 2023 - 07:00:56 EST


On Sat, Dec 02, 2023 at 02:23:59PM +0100, Willy Tarreau wrote:
> Hi Alexey,
>
> On Sat, Dec 02, 2023 at 03:45:13PM +0300, Alexey Dobriyan wrote:
> > Just jump into _start_c, it is not going to return anyway.
>
> Thanks, but what's upper in the stack there ?

argc

(gdb) break _start
(gdb) run

(gdb) x/20gx $sp
0x7fffffffdae0: 0x0000000000000004 0x00007fffffffdf33
0x7fffffffdaf0: 0x00007fffffffdf49 0x00007fffffffdf4b
0x7fffffffdb00: 0x00007fffffffdf4d 0x0000000000000000
0x7fffffffdb10: 0x00007fffffffdf4f 0x00007fffffffdf70
0x7fffffffdb20: 0x00007fffffffdf80 0x00007fffffffdfce

(gdb) x/s 0x00007fffffffdf33
0x7fffffffdf33: "/home/ad/s-test/a.out"

> I'm trying to make sure
> that if _start_c returns we don't get a random behavior.

Yes: ret pops argc into rip, so it should segfault executing from a very small address.
I tested with

.intel_syntax noprefix
.globl _start
_start:
	ret			# pops argc into rip, faults at a tiny address
	mov	eax, 231	# exit_group, never reached
	xor	edi, edi
	syscall
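
To spell out the layout seen in the dump above, here is a rough sketch in C
that decodes the words at the entry stack pointer. The helper names
(decode_entry_stack, ret_target) and the struct are made up for illustration
and are not part of nolibc. The array holds the ten words copied from the gdb
output. The point is only that the word at [sp] is argc, so a bare ret at
_start would use argc as the return address.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative view of the initial stack: argc, argv[], NULL, envp[], ... */
struct entry_stack {
	uint64_t argc;
	const uint64_t *argv;	/* argc pointers followed by a NULL word */
	const uint64_t *envp;	/* first word after argv's NULL terminator */
};

/* Hypothetical helper: decode the words found at the entry stack pointer. */
static struct entry_stack decode_entry_stack(const uint64_t *sp)
{
	struct entry_stack s;

	s.argc = sp[0];
	s.argv = &sp[1];
	s.envp = &sp[1 + s.argc + 1];	/* skip argc, argv[0..argc-1], NULL */
	return s;
}

/* What a bare `ret` at _start would load into rip: the word at [sp]. */
static uint64_t ret_target(const uint64_t *sp)
{
	return sp[0];
}

/* The ten words shown in the gdb dump, lowest address first. */
static const uint64_t dump[] = {
	0x0000000000000004ULL, 0x00007fffffffdf33ULL,
	0x00007fffffffdf49ULL, 0x00007fffffffdf4bULL,
	0x00007fffffffdf4dULL, 0x0000000000000000ULL,
	0x00007fffffffdf4fULL, 0x00007fffffffdf70ULL,
	0x00007fffffffdf80ULL, 0x00007fffffffdfce,
};
```

With this dump, ret_target(dump) is 4, i.e. the jump goes to address 0x4,
which is never mapped in a normal process, hence the systematic segfault
rather than random behaviour.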

> If we get a
> systematic crash (e.g. 0 always there) that's fine, what would be
> annoying would be random infinite loops etc. In the psABI description
> (table 3.9) I'm seeing "undefined" before argc, which I don't find
> much appealing.
>
> > Signed-off-by: Alexey Dobriyan <adobriyan@xxxxxxxxx>
> > ---
> >
> > Also, kernel clears all registers before starting process,
> > I'm not sure why
> >
> > xor ebp, ebp
> >
> > was added.
>
> Hmmm psABI says:
>
> Only the registers listed below have specified values at process entry:
>
> %rbp The content of this register is unspecified at process initialization
> time, but the user code should mark the deepest stack frame by setting
> the frame pointer to zero.
>
> %rsp The stack pointer holds the address of the byte with lowest address
> which is part of the stack. It is guaranteed to be 16-byte aligned at
> process entry.
>
> %rdx a function pointer that the application should register with atexit (BA_OS).
>
> Thus apparently it's documented as being our job to clear it :-/

I meant that the ELF loader clears all registers except rsp and aligns the stack to 16 bytes.
There have been problems with stack alignment, but the registers, I think, were always zeroed.