Re: [PATCH] x86/asm/entry/64: better check for canonical address

From: Ingo Molnar
Date: Fri Mar 27 2015 - 04:11:58 EST



* Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:

> This change makes the check exact (no more false positives
> on kernel addresses).
>
> It isn't really important to be fully correct here -
> almost all addresses we'll ever see will be userspace ones,
> but OTOH it looks to be cheap enough:
> the new code uses two more ALU ops but preserves %rcx,
> allowing to not reload it from pt_regs->cx again.
> On disassembly level, the changes are:
>
> cmp %rcx,0x80(%rsp) -> mov 0x80(%rsp),%r11; cmp %rcx,%r11
> shr $0x2f,%rcx -> shl $0x10,%rcx; sar $0x10,%rcx; cmp %rcx,%r11
> mov 0x58(%rsp),%rcx -> (eliminated)
>
> Signed-off-by: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
> CC: Borislav Petkov <bp@xxxxxxxxx>
> CC: x86@xxxxxxxxxx
> CC: linux-kernel@xxxxxxxxxxxxxxx
> ---
>
> Andy, I'm undecided myself on the merits of doing this.
> If you like it, feel free to take it in your tree.
> I trimmed CC list to not bother too many people with this trivial
> and quite possibly "useless churn"-class change.
>
> arch/x86/kernel/entry_64.S | 23 ++++++++++++-----------
> 1 file changed, 12 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index bf9afad..a36d04d 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -688,26 +688,27 @@ retint_swapgs: /* return to user-space */
> * a completely clean 64-bit userspace context.
> */
> movq RCX(%rsp),%rcx
> - cmpq %rcx,RIP(%rsp) /* RCX == RIP */
> + movq RIP(%rsp),%r11
> + cmpq %rcx,%r11 /* RCX == RIP */
> jne opportunistic_sysret_failed

Btw., in the normal syscall entry path, RIP(%rsp) == RCX(%rsp),
because we set up pt_regs like that - and at this point RIP/RCX is
guaranteed to be canonical, right?

So if there's a mismatch generated, it's the kernel's doing.
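
For reference, the two checks being compared in the quoted patch can be modelled in plain C. This is only a user-space sketch assuming 48-bit virtual addresses; the helper names are made up for illustration and are not kernel functions:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the old check (shr $0x2f,%rcx): passes only when bits 47..63
 * are all zero, i.e. only for addresses in the lower (user) half.
 * Canonical kernel addresses fail it too, which is the false positive
 * the patch description mentions.
 */
static bool old_check_passes(uint64_t addr)
{
	return (addr >> 47) == 0;
}

/*
 * Model of the new check (shl $0x10; sar $0x10; cmp): sign-extend from
 * bit 47 and compare with the original value. The address is canonical
 * iff the two are equal.
 *
 * Note: this relies on the unsigned-to-signed conversion wrapping and on
 * right shift of a negative value being arithmetic, which is what
 * mainstream compilers do but is implementation-defined in older C
 * standards.
 */
static bool is_canonical_48(uint64_t addr)
{
	int64_t sext = (int64_t)(addr << 16) >> 16;

	return (uint64_t)sext == addr;
}
```

With these, 0x00007fffffffffff passes both checks, 0x0000800000000000 fails both, and 0xffff800000000000 fails the old check while passing the new one.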

Why don't we detect those cases where a new return address is created
(ptrace, exec, etc.), check for canonicalness and add a TIF flag for
it (and add it to the work mask) and execute the IRET from the slow
path?

We already have a work-mask branch.

That would allow the removal of all these checks and the
canonicalization from the fast return path! We could go straight to
the SYSRET...

The frequency of exec() and ptrace() is 2-3 orders of magnitude lower
than the frequency of system calls, so this would be well worth it.
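
Roughly, the idea could be modelled like this. It is a user-space C sketch only, not kernel code: the flag name, the structs and the functions are all hypothetical, and for simplicity it sets the flag on every rewrite of the return address instead of only after a canonicalness check:

```c
#include <stdint.h>

/* Hypothetical flag and work mask, named for this sketch only. */
#define MODEL_TIF_FORCE_IRET	(1u << 0)
#define MODEL_WORK_MASK		(MODEL_TIF_FORCE_IRET)

struct model_regs {
	uint64_t ip;
	uint64_t cx;
};

struct model_task {
	struct model_regs regs;
	unsigned int flags;
};

enum model_exit_path { MODEL_EXIT_SYSRET, MODEL_EXIT_IRET };

/*
 * Rare path (ptrace, exec, ...): a new return address is created, so
 * flag the task to take the slow IRET path on its next return.
 */
static void model_set_return_address(struct model_task *t, uint64_t new_ip)
{
	t->regs.ip = new_ip;
	t->flags |= MODEL_TIF_FORCE_IRET;
}

/*
 * Hot path: the only test is the work-mask branch that already exists.
 * No RCX == RIP comparison and no canonical check is done here.
 */
static enum model_exit_path model_syscall_exit(struct model_task *t)
{
	if (t->flags & MODEL_WORK_MASK) {
		t->flags &= ~MODEL_TIF_FORCE_IRET;	/* one-shot in this model */
		return MODEL_EXIT_IRET;
	}
	return MODEL_EXIT_SYSRET;
}
```

The point of the sketch is that all the cost moves to the rare writer side, while the common return path pays nothing beyond the work-mask test it already performs.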

Am I missing anything?

Thanks,

Ingo