Re: [PATCH v2 4/4] x86/static_call: Add inline static call implementation for x86-64

From: Josh Poimboeuf
Date: Thu Nov 29 2018 - 11:33:51 EST


On Thu, Nov 29, 2018 at 03:38:53PM +0100, Peter Zijlstra wrote:
> On Thu, Nov 29, 2018 at 05:37:39AM -0800, Andy Lutomirski wrote:
> >
> >
> > > On Nov 29, 2018, at 1:42 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Wed, Nov 28, 2018 at 10:05:54PM -0800, Andy Lutomirski wrote:
> > >
> > >>>> +static void static_call_bp_handler(struct pt_regs *regs, void *_data)
> > >>>> +{
> > >>>> + struct static_call_bp_data *data = _data;
> > >>>> +
> > >>>> + /*
> > >>>> + * For inline static calls, push the return address on the stack so the
> > >>>> + * "called" function will return to the location immediately after the
> > >>>> + * call site.
> > >>>> + *
> > >>>> + * NOTE: This code will need to be revisited when kernel CET gets
> > >>>> + * implemented.
> > >>>> + */
> > >>>> + if (data->ret) {
> > >>>> + regs->sp -= sizeof(long);
> > >>>> + *(unsigned long *)regs->sp = data->ret;
> > >>>> + }
> > >>
> > >> You canât do this. Depending on the alignment of the old RSP, which
> > >> is not guaranteed, this overwrites regs->cs. IRET goes boom.
> > >
> > > I don't get it; can you spell that out?
> > >
> > > The way I understand it is that we're at a location where a "E8 - Near
> > > CALL" instruction should be, and thus RSP should be the regular kernel
> > > stack, and the above simply does "PUSH ret", which is what that CALL
> > > would've done too.
> > >
> >
> > int3 isnât IST anymore, so the int3 instruction conditionally
> > subtracts 8 from RSP and then pushes SS, etc. So my email was
> > obviously wrong wrt âcsâ, but youâre still potentially overwriting the
> > int3 IRET frame.
>
> ARGH!..
>
> can't we 'fix' that again? The alternative is moving that IRET-frame and
> fixing everything up, which is going to be fragile, ugly and such
> things more.
>
> Commit d8ba61ba58c8 ("x86/entry/64: Don't use IST entry for #BP stack")
> doesn't list any strong reasons for why it should NOT be an IST.

This seems to work...

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ce25d84023c0..184523447d35 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -876,7 +876,7 @@ apicinterrupt IRQ_WORK_VECTOR irq_work_interrupt smp_irq_work_interrupt
* @paranoid == 2 is special: the stub will never switch stacks. This is for
* #DF: if the thread stack is somehow unusable, we'll still get a useful OOPS.
*/
-.macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
+.macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1 create_gap=0
ENTRY(\sym)
UNWIND_HINT_IRET_REGS offset=\has_error_code*8

@@ -891,6 +891,12 @@ ENTRY(\sym)
pushq $-1 /* ORIG_RAX: no syscall to restart */
.endif

+ .if \create_gap == 1
+ .rept 6
+ pushq 5*8(%rsp)
+ .endr
+ .endif
+
.if \paranoid == 1
testb $3, CS-ORIG_RAX(%rsp) /* If coming from userspace, switch stacks */
jnz .Lfrom_usermode_switch_stack_\@
@@ -1126,7 +1132,7 @@ apicinterrupt3 HYPERV_STIMER0_VECTOR \
#endif /* CONFIG_HYPERV */

idtentry debug do_debug has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
-idtentry int3 do_int3 has_error_code=0
+idtentry int3 do_int3 has_error_code=0 create_gap=1
idtentry stack_segment do_stack_segment has_error_code=1

#ifdef CONFIG_XEN_PV