Re: [PATCH] tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list

From: Ammar Faizi
Date: Tue Oct 12 2021 - 04:37:54 EST


On Tue, Oct 12, 2021 at 12:28 PM Willy Tarreau <w@xxxxxx> wrote:
>
> Hello Ammar,
>
> First, thanks for your patch. I have a few questions below.
>
> On Mon, Oct 11, 2021 at 11:03:44AM +0700, Ammar Faizi wrote:
> > Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory").
> >
> >   - rax for the return value.
> >   - rcx to save the return address.
> >   - r11 to save the rflags.
> >
> > Other registers are preserved.
> >
> > Having r8, r9 and r10 in the syscall clobber list is harmless, but this
> > results in a missed-optimization.
> >
> > As the syscall doesn't clobber r8-r10, GCC should be allowed to reuse
> > their value after the syscall returns to userspace. But since they are
> > in the clobber list, GCC will always miss this opportunity.
>
> I agree with your conclusion about this but need to be perfectly sure
> about the exact clobber list since I got a different impression when
> implementing the calls. I got the impression that these ones were
> clobbered by reading entry_64.S below:
>
>  * Registers on entry:
>  * rax  system call number
>  * rcx  return address
>  * r11  saved rflags (note: r11 is callee-clobbered register in C ABI)
>  * rdi  arg0
>  * rsi  arg1
>  * rdx  arg2
>  * r10  arg3 (needs to be moved to rcx to conform to C ABI)
>  * r8   arg4
>  * r9   arg5
>  * (note: r12-r15, rbp, rbx are callee-preserved in C ABI)
>
> See this last note ? Indicating that r12-15, rbp and rbx are preserved
> made me think it's not the case for the other ones, but that might of
> course be a misunderstanding.
>
> And calling.h says this:
>
>  x86 function call convention, 64-bit:
>  -------------------------------------
>   arguments           |  callee-saved      | extra caller-saved | return
>  [callee-clobbered]   |                    | [callee-clobbered] |
>  ---------------------------------------------------------------------------
>  rdi rsi rdx rcx r8-9 | rbx rbp [*] r12-15 | r10-11             | rax, rdx [**]
>
> Note that it's indicated "function call convention", not "syscall",
> leaving it open to have extra restrictions on syscalls. Later by
> reading the POP_REGS macro called with pop_rdi=0 and skipr11rcx=1
> on syscall leave, it seems to restore everything but r11, rcx, rax
> and rdi (which is restored last with rsp).
>
> As such, could you please add in your commit message a link to the
> location where you found that authoritative information above it there
> is a better place (i.e. one that doesn't require to read all the macros)?
> This would significantly help anyone facing the same doubts about this
> in the future.

Hi Willy,

I have tried to search for the documentation about this one, but I
couldn't find any. Checking at `Documentation/x86/entry_64.rst`, but
it doesn't tell anything relevant.

Background story, I browsed the nolibc.h file and found extra clobbers
for Linux x86-64 syscall which I think they are unneccesary (r8, r9
and r10).

This finding had me worried a bit, because I have written syscall in
inline ASM, based on my understanding the required clobbers are "rcx",
"r11" and "memory". So in my mind was "Is my app safe?".

So I opened a discussion on Stack Overflow, yesterday:
https://stackoverflow.com/questions/69515893/when-does-linux-x86-64-syscall-clobber-r8-r9-and-r10

While waiting for answers on Stack Overflow, I also checked "entry_64.S"
and "calling.h". Now I strongly believe syscall only clobbers 3
registers, they are rax, rcx and r11. The answer and comments on SO
support it as well.

These two macros do the job:

// saves all GPRs and zero some of them
PUSH_AND_CLEAR_REGS

// restore all GPRs, except rdi, r11, rcx
POP_REGS pop_rdi=0 skip_r11rcx=1

// later rdi is also restored.

My stance comes from SO, Telegram group discussion, and reading source
code. Therefore, I don't think I can attach the link to it as
"authoritative information". Or can I?

When I sent this patch, I also added entry_64.S's maintainers to CC
list. In hope they can help to at least acknowledge it. Mainly because
I can't find the documentation from Linux that tells about this.

Andy, Thomas, Ingo, Borislav, H. Peter.

Could one of you shed some light so that I can attach the link to your
message in the commit message?

Thanks,

--
Ammar Faizi