Re: [PATCH 1/3] x86/entry: Clear extra registers beyond syscall arguments for 64bit kernels

From: Ingo Molnar
Date: Mon Feb 05 2018 - 14:48:42 EST



* Brian Gerst <brgerst@xxxxxxxxx> wrote:

> On Mon, Feb 5, 2018 at 1:29 PM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> >
> > * Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> >
> >> [...] Clearing R10 is mostly useless in the syscall path because we'll just
> >> unconditionally reload it in do_syscall_64().
> >
> > AFAICS do_syscall_64() doesn't touch R10 at all. So how does it reload R10?
> >
> > In fact do_syscall_64() as a C function does not touch R10, R11, R12, R13, R14,
> > R15 - it passes their values through.
> >
> > What am I missing?
>
> The syscall ABI uses R10 for the 4th argument instead of RCX, because
> RCX gets clobbered by the SYSCALL instruction for RIP.

But we only reload the syscall-entry value of R10 into RCX (the 4th C function
argument):

	regs->ax = sys_call_table[nr](
		regs->di, regs->si, regs->dx,
		regs->r10, regs->r8, regs->r9);

while RCX is a clobbered register, so in practice the value will be briefly
present in do_syscall_64() and the high-level syscall functions, but it will be
overwritten in RCX in the overwhelming majority of cases.
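
To illustrate the dispatch described above, here is a minimal userspace sketch,
not kernel code. Every model_* name is hypothetical and the register struct is
only a small subset loosely modelled on struct pt_regs. It shows the saved R10
value arriving as the 4th C argument (RCX in the SysV calling convention). It
does not model the live hardware R10, which the dispatch never touches.

```c
/* Userspace model only; all model_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>

/* Subset of the saved user registers, loosely modelled on struct pt_regs. */
struct model_regs {
	unsigned long di, si, dx, r10, r8, r9, ax;
};

typedef long (*model_sys_fn)(unsigned long, unsigned long, unsigned long,
			     unsigned long, unsigned long, unsigned long);

/* Hypothetical handler: returns its 4th C argument unchanged. */
long model_sys_return_arg4(unsigned long a1, unsigned long a2,
			   unsigned long a3, unsigned long a4,
			   unsigned long a5, unsigned long a6)
{
	(void)a1; (void)a2; (void)a3; (void)a5; (void)a6;
	return (long)a4;
}

static const model_sys_fn model_sys_call_table[] = {
	model_sys_return_arg4,
};

#define MODEL_NR_SYSCALLS \
	(sizeof(model_sys_call_table) / sizeof(model_sys_call_table[0]))

/*
 * Same shape as the quoted dispatch: the saved r10 is passed in the 4th
 * argument slot. An out-of-range nr leaves regs->ax untouched in this model.
 */
void model_do_syscall(unsigned long nr, struct model_regs *regs)
{
	if (nr < MODEL_NR_SYSCALLS)
		regs->ax = (unsigned long)model_sys_call_table[nr](
			regs->di, regs->si, regs->dx,
			regs->r10, regs->r8, regs->r9);
}

/* Small self-check: the user-supplied r10 value reaches the handler. */
void model_self_check(void)
{
	struct model_regs regs = {
		.di = 1, .si = 2, .dx = 3, .r10 = 0x1234, .r8 = 5, .r9 = 6,
	};

	model_do_syscall(0, &regs);
	assert(regs.ax == 0x1234);   /* handler saw r10 as argument 4 */
	assert(regs.r10 == 0x1234);  /* saved copy is not modified */
}
```

The model only copies the saved value. Nothing in it clears anything, which is
the same situation as the real register when the entry code does not clear it.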

But the real R10 will survive much longer, because it's only used in a very small
minority of the C functions!

So my point: if we clear R10 (and R11) from the _real_ registers, we can stop
propagating these user controlled values further into the kernel.

Thanks,

Ingo