Re: [PATCH] x86: vdso32/syscall.S: do not load __USER32_DS to %ss

From: Andy Lutomirski
Date: Wed Mar 25 2015 - 11:18:26 EST


On Wed, Mar 25, 2015 at 8:03 AM, Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
> On 03/25/2015 10:28 AM, Ingo Molnar wrote:
>>
>> * Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>
>>> Now we can do a fun hack on top. On Intel, we have
>>> sysenter/sysexitl and, on AMD, we have syscall/sysretl. But, if I
>>> read the docs right, Intel has sysretl, too. So we can ditch
>>> sysexit entirely, since this mechanism no longer has any need to
>>> keep the entry and exit conventions matching.
>>
>> So this only affects 32-bit vdsos, because on 64-bit both Intel and
>> AMD have and use SYSCALL/SYSRET.
>>
>> So my question would be: what's the performance difference between
>> INT80 and sysenter entries on 32-bit, on modern CPUs?
>>
>> If it's not too horrible (say below 100 cycles) then we could say that
>> we start out the simplification and robustification by switching Intel
>> over to INT80 + SYSRET on 32-bit, and once we know the 32-bit SYSRET
>> and all the other simplifications work fine we implement the
>> SYSENTER-hack on top of that?
>
> int 0x80 is about 250 cycles slower than syscall/sysenter.
> (I mean, the instruction per se, not the full round-trip).
> This looks too horrible to ignore :(

Agreed.

>
>
>> Is there any user-space code that relies on being able to execute an
>> open coded SYSENTER, or are we shielded via the vDSO?
>
> Userspace can't use open-coded sysenter. It will return to a different
> address.
>
> Userspace _can_ do this:
>
> my_sysenter:
> push %ecx
> push %edx
> push %ebp
> movl %esp,%ebp
> sysenter
> /* end of my_sysenter() */
>
> ...
> ...
> ...
>
> call my_sysenter
>
> but this depends on matching stack layout with one used by vDSO.
>
>

I'd be rather surprised if anyone does that. It'll die with SIGILL on
AMD systems, where sysenter is #UD in long mode (compat mode
included). Similarly, open-coded syscall instructions in 32-bit code
will die with SIGILL on Intel systems, where syscall is #UD in compat
mode.
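
For contrast, here is a rough sketch of what portable 32-bit userspace
would do instead of open-coding either instruction: let libc's generic
syscall() wrapper pick the entry path. The helper name portable_getpid
is made up for illustration, and I'm assuming a typical glibc build
that dispatches through the vDSO's __kernel_vsyscall on 32-bit x86
rather than emitting sysenter or syscall itself.

```c
#define _GNU_SOURCE
#include <assert.h>      /* only for the sanity check in the usage note */
#include <unistd.h>      /* syscall(), getpid() */
#include <sys/syscall.h> /* SYS_getpid */

/*
 * Hypothetical helper: issue getpid through libc's generic syscall()
 * wrapper. No sysenter/syscall/int $0x80 is written by hand here, so
 * the choice of entry instruction is left to libc and the kernel.
 */
static long portable_getpid(void)
{
    return syscall(SYS_getpid);
}
```

A quick sanity check is assert(portable_getpid() == getpid()); which
should hold on both vendors' CPUs because nothing vendor-specific is
hard-coded.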

Gee thanks, anyone.

<with time machine>The correct way to do this ought to have been
straightforward. Syscall should have stashed eip/rip in r8, r9, or
r10, and sysenter shouldn't exist in long mode. All of this mess
would just disappear completely.</with time machine>

--Andy