Re: [PATCH v5 8/9] x86-64: Emulate legacy vsyscalls

From: Brian Gerst
Date: Mon Jun 06 2011 - 10:07:58 EST

On Mon, Jun 6, 2011 at 9:58 AM, <pageexec@xxxxxxxxxxx> wrote:
> On 6 Jun 2011 at 8:43, Andrew Lutomirski wrote:
>> >> and it's less flexible
>> >
>> > why? as in, what kind of flexibility do you need that int xx can provide but a page
>> > fault cannot?
>> The ability to make time() fast when configured that way.
> true, nx and fast time() at vsyscall addresses will never mix. but it's a temporary
> problem for anyone who cares, a trivial glibc patch fixes it.
>> >> and it could impact a fast path in the kernel.
>> >
>> > a page fault is never a fast path, after all the cpu has just taken an exception
>> > (vs. the syscall/sysenter style actually fast user->kernel transition) and is
>> > about to make page table changes (and possibly TLB flushes).
>> Sure it is. It's a path that's optimized carefully and needs to be as
>> fast as possible. Just because it's annoyingly slow doesn't mean we
>> get to make it even slower.
> sorry, but stating that the pf handler is a fast path doesn't make it so ;).
> the typical pf is caused by userland to either fill in non-present pages
> or do c-o-w, a few well predicted conditional branches in those paths are
> simply not measurable (actually, those conditional branches would not be
> on those paths, at least they aren't in PaX). seriously, try it ;).
>> >> > another thing to consider for using the int xx redirection scheme (speaking
>> >> > of which, it should just be an int3):
>> >>
>> >> Why? 0xcd 0xcc traps no matter what offset you enter it at.
>> >
>> > but you're wasting/abusing an IDT entry for no real gain (and it's lots of code
>> > for such a little change). also placing sw interrupts among hw ones is what can
>> > result in (ab)use like this:
>> I think it's less messy than mucking with the page fault handler.
> do you know what that mucking looks like? ;) prepare for the most complex code
> you've ever seen (it's in __bad_area_nosemaphore):
>  779 #ifdef CONFIG_X86_64
>  780         if (mm && (error_code & PF_INSTR) && mm->context.vdso) {
>  781                 if (regs->ip == (unsigned long)vgettimeofday) {
>  782                         regs->ip = (unsigned long)VDSO64_SYMBOL(mm->context.vdso, gettimeofday);
>  784                 } else if (regs->ip == (unsigned long)vtime) {
>  785                         regs->ip = (unsigned long)VDSO64_SYMBOL(mm->context.vdso, clock_gettime);
>  787                 } else if (regs->ip == (unsigned long)vgetcpu) {
>  788                         regs->ip = (unsigned long)VDSO64_SYMBOL(mm->context.vdso, getcpu);
>  792 #endif

I like this approach. However, since we're already in the kernel, it
makes sense to just run the normal syscall instead of redirecting to
the vdso.
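
To make the suggestion concrete, here is a rough userspace sketch, not
kernel code. It assumes the three legacy entry points sit in 1024-byte
slots starting at 0xffffffffff600000. Everything named toy_* is
hypothetical: toy_regs stands in for the few pt_regs fields used, and
the toy_gettimeofday/toy_time/toy_getcpu stubs stand in for the real
syscall implementations. On an instruction-fetch fault at one of the
entry points, the handler runs the call itself, stores the result in
ax, and emulates the "ret" the vsyscall stub would have executed.

```c
#include <stdint.h>

/* Fixed legacy vsyscall entry points on x86-64, one per 1024-byte slot. */
#define VSYSCALL_BASE 0xffffffffff600000ULL
#define VSYSCALL_SLOT 0x400ULL

/* Hypothetical stand-in for the pt_regs fields this sketch touches. */
struct toy_regs {
	uint64_t ip;	/* faulting instruction pointer */
	uint64_t sp;	/* user stack pointer */
	uint64_t ax;	/* return value */
	uint64_t di;	/* first argument */
	uint64_t si;	/* second argument */
};

/* Hypothetical stubs; a real handler would call the actual syscalls. */
static long toy_gettimeofday(uint64_t tv, uint64_t tz)
{
	(void)tv; (void)tz;
	return 0;
}

static long toy_time(uint64_t tloc)
{
	(void)tloc;
	return 12345;	/* fixed value so the result is easy to check */
}

static long toy_getcpu(uint64_t cpu, uint64_t node)
{
	(void)cpu; (void)node;
	return 0;
}

/*
 * Returns 1 if the fault was an instruction fetch at a vsyscall entry
 * point and was emulated, 0 if it should be handled as a normal fault.
 * In this toy model regs->sp holds a real pointer to the "user" stack.
 */
int toy_emulate_vsyscall(struct toy_regs *regs, int is_instr_fetch)
{
	long ret;

	if (!is_instr_fetch)
		return 0;

	switch (regs->ip) {
	case VSYSCALL_BASE + 0 * VSYSCALL_SLOT:
		ret = toy_gettimeofday(regs->di, regs->si);
		break;
	case VSYSCALL_BASE + 1 * VSYSCALL_SLOT:
		ret = toy_time(regs->di);
		break;
	case VSYSCALL_BASE + 2 * VSYSCALL_SLOT:
		ret = toy_getcpu(regs->di, regs->si);
		break;
	default:
		return 0;
	}

	regs->ax = (uint64_t)ret;

	/* Emulate "ret": pop the caller's return address into ip. */
	regs->ip = *(const uint64_t *)(uintptr_t)regs->sp;
	regs->sp += 8;
	return 1;
}
```

With ip set to the time() slot and sp pointing at a saved return
address, the call returns 1, leaves the stub's result in ax, and
resumes at the caller; any other ip falls through to the normal fault
path.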

Brian Gerst