Re: [PATCH] x86/retpoline/entry: Disable the entire SYSCALL64 fast path with retpolines on

From: Will Deacon
Date: Mon Jan 29 2018 - 08:19:49 EST


Hi Andy,

On Fri, Jan 26, 2018 at 10:23:23AM -0800, Andy Lutomirski wrote:
> On Fri, Jan 26, 2018 at 10:13 AM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Fri, Jan 26, 2018 at 10:07 AM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >>
> >> Umm... What about other architectures? Or do you want SYSCALL_DEFINE...
> >> to be per-arch? I wonder how much would that "go through pt_regs" hurt
> >> on something like sparc...
> >
> > No, but I just talked to Will Deacon about register clearing on entry,
> > and so I suspect that arm64 might want something similar too.
> >
> > So I think some opt-in for letting architectures add their own
> > function would be good. Because it wouldn't be all architectures, but
> > it probably _would_ be more than just x86.
> >
> > You need to add architecture-specific "load argX from ptregs" macros anyway.
>
> I mocked that up, and it's straightforward. I ended up with something like:
>
> #define __ARCH_SYSCALL_ARGS(n, ...) (regs->di, ...)
>
> (obviously modified so it actually compiles.)
>
> The issue is that doing it this way gives us, effectively:
>
> long sys_foo(int a, int b)
> {
> body here;
> }
>
> long SyS_foo(const struct pt_regs *regs)
> {
> return sys_foo(regs->di, regs->si);
> }
>
> whereas what we want is *static* long sys_foo(...). So I could split
> the macros into:
>
> DEFINE_SYSCALL2(foo, ....)
>
> and
>
> DEFINE_EXTERN_SYSCALL2(foo, ...)
>
> or I could just fix up all the code that expects calling sys_foo()
> across files to work.

Another issue with this style of macro definition exists on architectures
where the calling convention needs you to carry state around depending on
how you packed the previous parameters. For example, on 32-bit ARM, 64-bit
values are passed in adjacent pairs of registers, but the lower-numbered
register of each pair must be even. This is what stopped me from trying to use
existing helpers such as syscall_get_arguments to unpack the pt_regs
and it generally means that anything that says "get me argument n" is going
to require constructing arguments 0..n-1 first.
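
To make the dependency concrete, here is a rough sketch of the register
allocation rule described above. The helper name arg_first_reg and the
"array of argument sizes" interface are hypothetical, invented purely for
illustration; nothing like this exists in the tree as far as this mail is
concerned. It only models the even-pair rule, not the spill to the stack
once registers run out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the index of the first register that holds
 * argument n, given the size in bytes (4 or 8) of every argument up to
 * and including n.
 *
 * Assumption: 64-bit values live in an even/odd register pair, so an odd
 * register is skipped before allocating one.
 */
static unsigned int arg_first_reg(const unsigned char *sizes, size_t n)
{
	unsigned int reg = 0;
	size_t i;

	for (i = 0; i <= n; i++) {
		assert(sizes[i] == 4 || sizes[i] == 8);

		if (sizes[i] == 8 && (reg & 1))
			reg++;			/* pad up to an even register */
		if (i == n)
			break;			/* reg now points at argument n */
		reg += sizes[i] / 4;		/* consume one or two registers */
	}

	return reg;
}
```

With sizes { 4, 8, 4 }, the second argument starts at register 2 rather
than 1, and the third lands in register 4. There is no way to compute that
without walking the earlier arguments, which is the problem with a pure
"get me argument n" interface.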

To do this properly I think we'll either need to pass back the size and
current register offset to the arch code, or just allow the thing to be
overridden per syscall (the case above isn't especially frequent).
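
The first option could look roughly like the sketch below, where the
generic code hands the arch code a cursor plus the size of the argument it
wants next. struct arg_cursor and next_arg are made-up names, the saved
registers are modelled as a plain array of 32-bit values rather than a
real pt_regs, and a little-endian layout (low word in the even register)
is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cursor: tracks how far argument unpacking has got. */
struct arg_cursor {
	unsigned int reg;	/* index of the next unallocated register */
};

/*
 * Hypothetical arch hook: fetch the next argument of the given size
 * (4 or 8 bytes) and advance the cursor.
 *
 * Assumptions: regs[] stands in for the saved argument registers, and
 * a 64-bit value has its low word in the even register of the pair.
 */
static uint64_t next_arg(const uint32_t *regs, struct arg_cursor *cur,
			 unsigned int size)
{
	uint64_t val;

	assert(size == 4 || size == 8);

	if (size == 8) {
		cur->reg += cur->reg & 1;	/* skip an odd register */
		val = regs[cur->reg] |
		      ((uint64_t)regs[cur->reg + 1] << 32);
		cur->reg += 2;
	} else {
		val = regs[cur->reg++];
	}

	return val;
}
```

The macro expansion would then have to call this once per parameter, in
order, instead of indexing the registers directly.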

Will