Re: New vsyscall emulation breaks JITs

From: H. Peter Anvin
Date: Tue Aug 09 2011 - 21:49:53 EST

Greg Lueck <lueckintel@xxxxxxxxx> wrote:

>Yes, this sounds like a cleaner solution. What happens, though, if the
>system call is interrupted by a signal or by ptrace(ATTACH)? Does RIP
>point at the target of the RET instruction? Is it moved back to the
>entry of the vsyscall page? Does it point immediately after the
>SYSCALL instruction? GDB might also care about these details.
>> That's a fun corner case. Is the problem that you might receive a
>> signal while single-stepping?
>Actually, the situation is more difficult. The application may have
>received a signal while inside the gate, sometime before the SYSENTER
>trap. The signal context frame on the application's stack now has RIP
>pointing someplace inside the gate. At this point, Pin attaches to the
>native process, and it has no reasonable way to know about the saved
>context with this RIP value. Later, the application (running under
>Pin) will return from its handler and resume execution in the middle of
>the gate code. What can Pin do here? It's too late to execute
>natively at the start of the gate. If Pin executes natively at the
>signal return point, Pin will lose control of the application and it
>will execute natively from that point forward.
>-- Greg
>From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>To: Andrew Lutomirski <luto@xxxxxxx>
>Cc: Greg Lueck <lueckintel@xxxxxxxxx>; H. Peter Anvin <hpa@xxxxxxxxx>;
>Andi Kleen <andi@xxxxxxxxxxxxxx>; "x86@xxxxxxxxxx" <x86@xxxxxxxxxx>;
>"linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>;
>"kimwooyoung@xxxxxxxxx" <kimwooyoung@xxxxxxxxx>
>Sent: Tuesday, August 9, 2011 6:36 PM
>Subject: Re: New vsyscall emulation breaks JITs
>On Tue, Aug 9, 2011 at 2:04 PM, Andrew Lutomirski <luto@xxxxxxx> wrote:
>> Here's a different proposal, then:
>> What if the kernel had the sequence:
>> mov $__NR_whatever,%eax
>> syscall
>> ret
>> in the vsyscall page but marked the vsyscall page NX.
>This sounds like a sound idea. And then the difference between "fast
>and native" and "slow and trapping" ends up literally being just the
>NX bit.
>            Linus

The logical answer is that RIP will point to the entry of the vsyscall page.
