Re: [PATCH v4 0/6] Micro-optimize vclock_gettime
From: Ingo Molnar
Date: Wed May 18 2011 - 04:32:27 EST
* Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> > And time() and sched_getcpu() call the vsyscall page unconditionally.
>
> Dammit, time() is a real problem. I missed that and thought that it's
> gettimeofday() alone for the static case. sched_getcpu() is nothing to worry
> about.
There's a relatively simple solution for all this:
 - We can make the old vsyscall page contain an int $0x81 (it is a free vector).
 - We can use vector 0x81 as a wrapper around the int80 entry: it would check
   the syscall number and return immediately if it is outside the small set of
   permitted syscalls.
 - We can put this behind a straightforward CONFIG_COMPAT_VSYSCALL=y option,
   enabled by default for compatibility.
 - Distros that fix glibc can turn it off.
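
To make the filtering step concrete, here is a rough user-space sketch of the
check the 0x81 wrapper would do before handing off to the int80 path. This is
not kernel code and not a proposed patch: every sketch_* name is hypothetical,
the syscall numbers are placeholders rather than real table entries, and
returning -ENOSYS for a rejected number is my assumption (the mail only says
the wrapper would "return").

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder numbers for illustration only, NOT real syscall numbers. */
enum sketch_nr {
	SKETCH_NR_GETTIMEOFDAY = 1,
	SKETCH_NR_TIME         = 2,
	SKETCH_NR_GETCPU       = 3,
	SKETCH_NR_EXECVE       = 4,	/* stands in for "anything else" */
};

/* The small set of syscalls the legacy vsyscall page has to keep working. */
static const long sketch_permitted[] = {
	SKETCH_NR_GETTIMEOFDAY,
	SKETCH_NR_TIME,
	SKETCH_NR_GETCPU,
};

/* Return true only if nr is one of the permitted syscalls. */
static bool sketch_vsyscall_permitted(long nr)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_permitted) / sizeof(sketch_permitted[0]); i++)
		if (sketch_permitted[i] == nr)
			return true;
	return false;
}

/* Stand-in for the existing int80 entry; a real one would dispatch nr. */
static long sketch_fake_int80(long nr)
{
	(void)nr;
	return 0;
}

/*
 * Hypothetical 0x81 wrapper: it looks only at the syscall number held in a
 * register, and forwards to the int80 path only for permitted numbers.
 * The -ENOSYS return value is an assumption of this sketch.
 */
static long sketch_int81_filter(long nr, long (*int80_entry)(long))
{
	assert(int80_entry != NULL);

	if (!sketch_vsyscall_permitted(nr))
		return -ENOSYS;
	return int80_entry(nr);
}
```

With this shape, sketch_int81_filter(SKETCH_NR_TIME, sketch_fake_int80) goes
through to the stand-in entry, while any number outside the list is rejected
before the int80 logic is reached.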
Costs:
 - the performance cost of this solution is minimal: weirdly built binaries on
   unfixed glibc will have a handful of syscalls execute via int $0x81 rather
   than the syscall instruction. The cost of that is +50 nsecs at most - not 500.
 - almost zero maintenance cost: it just wraps the existing int80 logic. It does
   not even have to use any kernel stack, it only checks register arguments, so
   the code is truly small and trivial to keep secure.
Advantages:
 - we defang the constant-address syscall instruction this way - it cannot be
   used for anything even remotely useful to an exploit.
 - it's very simple
 - there's a future path out of it: the option can be deprecated later
What do you think?
Thanks,
Ingo