Re: [PATCH] x86/retpoline/entry: Disable the entire SYSCALL64 fast path with retpolines on
From: Linus Torvalds
Date: Thu Jan 25 2018 - 15:55:02 EST
On Thu, Jan 25, 2018 at 12:04 PM, Brian Gerst <brgerst@xxxxxxxxx> wrote:
>
> Another extra step the slow path does is checking to see if ptregs is
> safe for SYSRET. I think that can be mitigated by moving the check to
> the places that do modify ptregs (ptrace, sigreturn, and exec) which
> would set a flag to force return with IRET if the modified regs do not
> satisfy the criteria for SYSRET.
I tried to do some profiling, and none of that shows up for me.
That said, what _also_ doesn't show up is the actual page table switch
on entry. And that seems to be because the per-cpu trampoline code
isn't captured by perf (or at least not shown). Oh well.
What _does_ show up a bit is this in prepare_exit_to_usermode():
#ifdef CONFIG_COMPAT
        /*
         * Compat syscalls set TS_COMPAT. Make sure we clear it before
         * returning to user mode. We need to clear it *after* signal
         * handling, because syscall restart has a fixup for compat
         * syscalls. The fixup is exercised by the ptrace_syscall_32
         * selftest.
         *
         * We also need to clear TS_REGS_POKED_I386: the 32-bit tracer
         * special case only applies after poking regs and before the
         * very next return to user mode.
         */
        current->thread.status &= ~(TS_COMPAT|TS_I386_REGS_POKED);
#endif
and I think the problem there is that it is unnecessarily dirtying
that cacheline. Afaik, those bits are already clear 99.999% of the
time.
So things would be better if that 'status' were in the thread-info
(to keep cachelines close to the other stuff we already touch) and the
code had something like
        if (unlikely(ti->status & (TS_COMPAT|TS_I386_REGS_POKED)))
or whatever.
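To spell out the idea, here is a minimal user-space sketch of the test-before-clear pattern. It is not the kernel code: the struct name, the helper name and the flag values are all made up for illustration, and __builtin_expect stands in for the kernel's unlikely(). The point is only that the common case does a load and a branch, and the store (which is what dirties the cacheline) happens only when one of the bits is actually set.

```c
/* Hypothetical flag values, for illustration only. */
#define TS_COMPAT          0x0002u
#define TS_I386_REGS_POKED 0x0004u

/* Stand-in for wherever 'status' ends up living (assumed layout). */
struct thread_info_sketch {
        unsigned int status;
};

/*
 * Clear the compat bits only if at least one of them is set.
 * Returns 1 if a store was performed, 0 if the word was only read.
 */
static int clear_compat_status(struct thread_info_sketch *ti)
{
        const unsigned int mask = TS_COMPAT | TS_I386_REGS_POKED;

        /* Rare path: a bit is set, so the write is actually needed. */
        if (__builtin_expect((ti->status & mask) != 0, 0)) {
                ti->status &= ~mask;
                return 1;
        }

        /* Common path: nothing to clear, the cacheline stays clean. */
        return 0;
}
```

The return value is there only so the two paths can be told apart in a test; the real code would have no use for it.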
There might be other similar small tuning issues going on.
So there is room for improvement there in the slow path.
Linus