Re: [PATCH v5 8/9] x86-64: Emulate legacy vsyscalls

From: Brian Gerst
Date: Mon Jun 06 2011 - 10:07:58 EST


On Mon, Jun 6, 2011 at 9:58 AM, <pageexec@xxxxxxxxxxx> wrote:
> On 6 Jun 2011 at 8:43, Andrew Lutomirski wrote:
>
>> >> and it's less flexible
>> >
>> > why? as in, what kind of flexibility do you need that int xx can provide but a page
>> > fault cannot?
>>
>> The ability to make time() fast when configured that way.
>
> true, nx and fast time() at vsyscall addresses will never mix. but it's a temporary
> problem for anyone who cares, a trivial glibc patch fixes it.
>
>> >> and it could impact a fast path in the kernel.
>> >
>> > a page fault is never a fast path, after all the cpu has just taken an exception
>> > (vs. the syscall/sysenter style actually fast user->kernel transition) and is
>> > about to make page table changes (and possibly TLB flushes).
>>
>> Sure it is. It's a path that's optimized carefully and needs to be as
>> fast as possible. Just because it's annoyingly slow doesn't mean we
>> get to make it even slower.
>
> sorry, but stating that the pf handler is a fast path doesn't make it so ;).
> the typical pf is caused by userland to either fill in non-present pages
> or do c-o-w, a few well predicted conditional branches in those paths are
> simply not measurable (actually, those conditional branches would not be
> on those paths, at least they aren't in PaX). seriously, try it ;).
>
>> >> > another thing to consider for using the int xx redirection scheme (speaking
>> >> > of which, it should just be an int3):
>> >>
>> >> Why? 0xcd 0xcc traps no matter what offset you enter it at.
>> >
>> > but you're wasting/abusing an IDT entry for no real gain (and it's lots of code
>> > for such a little change). also placing sw interrupts among hw ones is what can
>> > result in (ab)use like this:
>>
>> I think it's less messy than mucking with the page fault handler.
>
> do you know what that mucking looks like? ;) prepare for the most complex code
> you've ever seen (it's in __bad_area_nosemaphore):
>
> #ifdef CONFIG_X86_64
>         if (mm && (error_code & PF_INSTR) && mm->context.vdso) {
>                 if (regs->ip == (unsigned long)vgettimeofday) {
>                         regs->ip = (unsigned long)VDSO64_SYMBOL(mm->context.vdso, gettimeofday);
>                         return;
>                 } else if (regs->ip == (unsigned long)vtime) {
>                         regs->ip = (unsigned long)VDSO64_SYMBOL(mm->context.vdso, clock_gettime);
>                         return;
>                 } else if (regs->ip == (unsigned long)vgetcpu) {
>                         regs->ip = (unsigned long)VDSO64_SYMBOL(mm->context.vdso, getcpu);
>                         return;
>                 }
>         }
> #endif

I like this approach. However, since we're already in the kernel, it
makes sense to just run the normal syscall instead of redirecting to
the vdso.
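
As a rough illustration only (a userspace model, not kernel code), the
idea could look something like the sketch below. The fixed vsyscall
addresses are the legacy x86-64 ones; struct fake_regs, the fake_sys_*
stubs, their return values, and emulate_vsyscall() are all hypothetical
names made up for this example. Instead of pointing regs->ip at a vdso
symbol, the handler runs the syscall itself, puts the result in ax, and
emulates the "ret" that the vsyscall stub would have executed.

```c
#include <stdint.h>

/* Fixed legacy vsyscall entry points on x86-64. */
#define VSYSCALL_GETTIMEOFDAY 0xffffffffff600000ULL
#define VSYSCALL_TIME         0xffffffffff600400ULL
#define VSYSCALL_GETCPU       0xffffffffff600800ULL

/* Hypothetical stand-in for struct pt_regs; only the fields used here. */
struct fake_regs {
	uint64_t ip;     /* faulting instruction pointer */
	uint64_t *sp;    /* simulated user stack; top word is the return address */
	uint64_t ax;     /* return-value register */
	uint64_t di, si; /* first two argument registers */
};

/* Hypothetical stubs standing in for the real syscall implementations.
 * They ignore their arguments and return fixed values (assumption). */
static uint64_t fake_sys_gettimeofday(uint64_t tv, uint64_t tz)
{
	(void)tv; (void)tz;
	return 0;
}

static uint64_t fake_sys_time(uint64_t tloc)
{
	(void)tloc;
	return 12345;
}

static uint64_t fake_sys_getcpu(uint64_t cpu, uint64_t node)
{
	(void)cpu; (void)node; /* third (cache) argument omitted in this model */
	return 0;
}

/* Returns 1 if the fault was an emulated vsyscall, 0 otherwise. */
int emulate_vsyscall(struct fake_regs *regs)
{
	uint64_t ret;

	switch (regs->ip) {
	case VSYSCALL_GETTIMEOFDAY:
		ret = fake_sys_gettimeofday(regs->di, regs->si);
		break;
	case VSYSCALL_TIME:
		ret = fake_sys_time(regs->di);
		break;
	case VSYSCALL_GETCPU:
		ret = fake_sys_getcpu(regs->di, regs->si);
		break;
	default:
		return 0; /* not a vsyscall address: handle as a normal fault */
	}

	regs->ax = ret;         /* syscall result goes back in ax */
	regs->ip = *regs->sp++; /* emulate "ret": pop the caller's return address */
	return 1;
}
```

With a simulated stack holding one return address and ip set to
VSYSCALL_TIME, emulate_vsyscall() reports the fault as handled, leaves
the stub's value in ax, and resumes at the popped return address. Any
other ip falls through to the normal fault path.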

--
Brian Gerst