Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64
From: Josh Poimboeuf
Date: Thu Nov 29 2018 - 17:14:57 EST
On Thu, Nov 29, 2018 at 11:01:48PM +0100, Peter Zijlstra wrote:
> On Thu, Nov 29, 2018 at 11:10:50AM -0600, Josh Poimboeuf wrote:
> > On Thu, Nov 29, 2018 at 08:59:31AM -0800, Andy Lutomirski wrote:
>
> > > (like pointing IP at a stub that retpolines to the target by reading
> > > the function pointer, a la the unoptimizable version), then okay, I
> > > guess, with only a small amount of grumbling.
> >
> > I tried that in v2, but Peter pointed out it's racy:
> >
> > https://lkml.kernel.org/r/20181126160217.GR2113@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> Ah, but that is because it is a global shared trampoline.
>
> Each static_call has its own trampoline, which currently reads
> something like:
>
> RETPOLINE_SAFE
> JMP *key
>
> which you then 'defuse' by writing a UD2 over it. _However_, if you write
> that trampoline like:
>
> 1: RETPOLINE_SAFE
> JMP *key
> 2: CALL_NOSPEC *key
> RET
>
> and have the text_poke_bp() handler jump to 2 (a location you'll never
> reach when you enter at 1), it will in fact work I think. The trampoline
> is never modified and not shared between different static_calls.

But after returning from the function to the trampoline, how does it
return from the trampoline to the call site? At that point there is no
return address on the stack.
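
To make the concern concrete, here is a toy Python model of the stack. It is
not kernel code, and every name in it is made up for illustration. It assumes
the breakpoint handler only redirects IP to label 2 and pushes nothing, so the
original CALL at the call site never executed:

```python
# Toy model of the stack around the trampoline; all names are hypothetical.
CALL_SITE_NEXT = "call_site+5"  # instruction after the CALL at the call site
TRAMP_RET = "tramp_ret"         # the RET that follows CALL_NOSPEC at label 2


def run_normal_call(stack):
    """Unpatched path: CALL at the call site, then enter the trampoline at 1."""
    stack = list(stack)
    stack.append(CALL_SITE_NEXT)  # CALL pushes the return address
    # 1: JMP *key pushes nothing; the target's RET pops the call site address
    return stack.pop()


def run_via_label_2(stack):
    """Patching path: the handler sets IP to label 2, the CALL never ran."""
    stack = list(stack)
    stack.append(TRAMP_RET)       # 2: CALL_NOSPEC *key pushes the RET address
    returned_to = stack.pop()     # the target's RET returns into the trampoline
    assert returned_to == TRAMP_RET
    # The trampoline's RET now pops whatever happens to be on top of the stack
    return stack.pop() if stack else None


caller_frame = ["caller_saved_data"]
print(run_normal_call(caller_frame))  # call_site+5
print(run_via_label_2(caller_frame))  # caller_saved_data
```

In this model the trampoline's RET pops the caller's data rather than a
return address, unless something pushes the call site's return address before
control reaches label 2.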
--
Josh