Re: [PATCH] x86/asm/entry/64: better check for canonical address

From: Brian Gerst
Date: Fri Mar 27 2015 - 07:28:48 EST


On Fri, Mar 27, 2015 at 7:17 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> * Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
>
>> On 03/27/2015 09:11 AM, Ingo Molnar wrote:
>> >
>> > * Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
>> >
>> >> This change makes the check exact (no more false positives
>> >> on kernel addresses).
>> >>
>> >> It isn't really important to be fully correct here -
>> >> almost all addresses we'll ever see will be userspace ones,
>> >> but OTOH it looks to be cheap enough:
>> >> the new code uses two more ALU ops but preserves %rcx,
>> >> allowing us to avoid reloading it from pt_regs->cx.
>> >> On disassembly level, the changes are:
>> >>
>> >> cmp %rcx,0x80(%rsp) -> mov 0x80(%rsp),%r11; cmp %rcx,%r11
>> >> shr $0x2f,%rcx -> shl $0x10,%rcx; sar $0x10,%rcx; cmp %rcx,%r11
>> >> mov 0x58(%rsp),%rcx -> (eliminated)
>> >>
>> >> Signed-off-by: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
>> >> CC: Borislav Petkov <bp@xxxxxxxxx>
>> >> CC: x86@xxxxxxxxxx
>> >> CC: linux-kernel@xxxxxxxxxxxxxxx
>> >> ---
>> >>
>> >> Andy, I'm undecided myself on the merits of doing this.
>> >> If you like it, feel free to take it in your tree.
>> >> I trimmed the CC list so as not to bother too many people with this
>> >> trivial and quite possibly "useless churn"-class change.
>> >>
>> >> arch/x86/kernel/entry_64.S | 23 ++++++++++++-----------
>> >> 1 file changed, 12 insertions(+), 11 deletions(-)
>> >>
>> >> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
>> >> index bf9afad..a36d04d 100644
>> >> --- a/arch/x86/kernel/entry_64.S
>> >> +++ b/arch/x86/kernel/entry_64.S
>> >> @@ -688,26 +688,27 @@ retint_swapgs: /* return to user-space */
>> >> * a completely clean 64-bit userspace context.
>> >> */
>> >> movq RCX(%rsp),%rcx
>> >> - cmpq %rcx,RIP(%rsp) /* RCX == RIP */
>> >> + movq RIP(%rsp),%r11
>> >> + cmpq %rcx,%r11 /* RCX == RIP */
>> >> jne opportunistic_sysret_failed
>> >
>> > Btw., in the normal syscall entry path, RIP(%rsp) == RCX(%rsp),
>> > because we set up pt_regs like that - and at this point RIP/RCX is
>> > guaranteed to be canonical, right?
>> >
>> > So if there's a mismatch generated, it's the kernel's doing.
>>
>> This is an optimization on IRET exit code path.
>>
>> We go here if we know that pt_regs can be modified by e.g. ptrace.
>>
>> I think we also go here on interrupt return.
>
> Yeah, missed that, this would kill any flag based approach.
>
>> (Granted, chances that RCX was the same as RIP at the moment of
>> interrupt are slim, but we would still check that and (ab)use SYSRET
>> if it looks like it'll work).
>
> Btw., there's a neat trick we could do: in the HLT, MWAIT and
> ACPI-idle code we could attempt to set up RCX to match RIP, to trigger
> this optimization in the common 'irq interrupted the idle task' case?

sysret only returns to CPL3, so it can't be used to return to the
kernel's idle loop.

--
Brian Gerst
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/