Re: [PATCH v3 1/3] x86: Implement arch_prctl(ARCH_VSYSCALL_CONTROL) to disable vsyscall

From: Andy Lutomirski
Date: Thu Jan 13 2022 - 16:47:38 EST


On 1/5/22 08:02, Florian Weimer wrote:
Distributions struggle with changing the default for vsyscall
emulation because it is a clear break of userspace ABI, something
that should not happen.

The legacy vsyscall interface is supposed to be used by libcs only,
not by applications. This commit adds a new arch_prctl request,
ARCH_VSYSCALL_CONTROL, with one argument. If the argument is 0,
executing vsyscalls will cause the process to terminate. Argument 1
turns vsyscall back on (this is mostly for a largely theoretical
CRIU use case).

Newer libcs can use a zero ARCH_VSYSCALL_CONTROL at startup to disable
vsyscall for the process. Legacy libcs do not perform this call, so
vsyscall remains enabled for them. This approach should achieve
backwards compatibility (perfect compatibility if the assumption that
only libcs use vsyscall is accurate), and it provides full hardening
for new binaries.

The chosen value of ARCH_VSYSCALL_CONTROL should avoid conflicts
with other x86-64 arch_prctl requests. The fact that with
vsyscall=emulate, reading the vsyscall region is still possible
even after a zero ARCH_VSYSCALL_CONTROL is considered a limitation
in the current implementation and may change in a future kernel
version.

Future arch_prctl requests commonly used at process startup can imply
ARCH_VSYSCALL_CONTROL with a zero argument, so that a separate system
call for disabling vsyscall is avoided.

Signed-off-by: Florian Weimer <fweimer@xxxxxxxxxx>
Acked-by: Andrei Vagin <avagin@xxxxxxxxx>
---
v3: Remove warning log message. Split out test.
v2: ARCH_VSYSCALL_CONTROL instead of ARCH_VSYSCALL_LOCKOUT. New tests
for the toggle behavior. Implement hiding [vsyscall] in
/proc/PID/maps and test it. Various other test fixes and cleanups
(e.g., fixed missing second argument to gettimeofday).

arch/x86/entry/vsyscall/vsyscall_64.c | 7 ++++++-
arch/x86/include/asm/mmu.h | 6 ++++++
arch/x86/include/uapi/asm/prctl.h | 2 ++
arch/x86/kernel/process_64.c | 7 +++++++
4 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index fd2ee9408e91..6fc524b9f232 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -174,6 +174,9 @@ bool emulate_vsyscall(unsigned long error_code,
tsk = current;
+ if (tsk->mm->context.vsyscall_disabled)
+ goto sigsegv;
+

Is there a reason you didn't just change the check earlier in the function to:

if (vsyscall_mode == NONE || current->mm->context.vsyscall_disabled)
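
To make the combined condition concrete, here is a minimal user-space model of it. This is a sketch, not kernel code: the enum, struct mm_model, and vsyscall_rejected() are hypothetical names invented for illustration, and the model only captures the boolean decision, not what the kernel does after it.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's vsyscall= boot mode. */
enum vsyscall_mode { VSYSCALL_EMULATE, VSYSCALL_XONLY, VSYSCALL_NONE };

/* Hypothetical stand-in for the per-mm flag the patch adds
 * (mm->context.vsyscall_disabled in the quoted diff). */
struct mm_model {
	bool vsyscall_disabled;
};

/*
 * Returns true when a vsyscall attempt should not be emulated:
 * either vsyscall is off system-wide, or this process opted out.
 */
static bool vsyscall_rejected(enum vsyscall_mode mode,
			      const struct mm_model *mm)
{
	return mode == VSYSCALL_NONE || mm->vsyscall_disabled;
}
```

With a single check like this, the per-process opt-out and vsyscall=none take the same path instead of being handled in two places.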

Also, I still think the prctl should not be available if vsyscall=emulate. Either we should fully implement it or we should not implement it at all. We could even do:

pr_warn_once("userspace vsyscall hardening request ignored because you have vsyscall=emulate. Unless you absolutely need vsyscall=emulate, update your system to use vsyscall=xonly.\n");

and thus encourage good behavior.
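
As a rough illustration of that policy, the following user-space model refuses the request under vsyscall=emulate and warns only once. It is a sketch under assumptions: vsyscall_control(), struct mm_model, and the enum are hypothetical names, the choice of -EINVAL as the error and the rejection of arguments other than 0 and 1 are my assumptions rather than anything in the patch, and fprintf() stands in for pr_warn_once().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's vsyscall= boot mode. */
enum vsyscall_mode { VSYSCALL_EMULATE, VSYSCALL_XONLY, VSYSCALL_NONE };

/* Hypothetical stand-in for the per-mm flag the patch adds. */
struct mm_model {
	bool vsyscall_disabled;
};

/*
 * Model of the request handler: arg 0 disables vsyscall for the
 * process, arg 1 re-enables it. Under vsyscall=emulate the request
 * is refused, because the region would remain readable anyway.
 */
static int vsyscall_control(enum vsyscall_mode mode, struct mm_model *mm,
			    unsigned long arg)
{
	static bool warned; /* emulates the "once" in pr_warn_once() */

	if (arg > 1)
		return -EINVAL; /* assumption: other values are invalid */

	if (mode == VSYSCALL_EMULATE) {
		if (!warned) {
			warned = true;
			fprintf(stderr, "userspace vsyscall hardening request ignored because you have vsyscall=emulate.\n");
		}
		return -EINVAL; /* assumption: the error code is illustrative */
	}

	mm->vsyscall_disabled = (arg == 0);
	return 0;
}
```

In this model a libc that gets an error back simply carries on with vsyscall enabled, which matches what an older kernel without the request would give it.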

--Andy