[tip:x86/asm] x86/asm/entry: Fix execve() and sigreturn() syscalls to always return via IRET

From: tip-bot for Brian Gerst
Date: Mon Mar 23 2015 - 08:21:13 EST


Commit-ID: 1daeaa315164c60b937f56fe3848d4328c358eba
Gitweb: http://git.kernel.org/tip/1daeaa315164c60b937f56fe3848d4328c358eba
Author: Brian Gerst <brgerst@xxxxxxxxx>
AuthorDate: Sat, 21 Mar 2015 18:54:21 -0400
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Mon, 23 Mar 2015 08:52:46 +0100

x86/asm/entry: Fix execve() and sigreturn() syscalls to always return via IRET

Both the execve() and sigreturn() families of syscalls can
change registers in ways that may not be compatible
with the syscall path they were called from.

In particular, SYSRET and SYSEXIT can't handle non-default %cs and %ss,
or certain bits in EFLAGS.

These syscalls have stubs that are hardcoded to jump to the IRET path,
and not return to the original syscall path.

The following commit:

76f5df43cab5e76 ("Always allocate a complete "struct pt_regs" on the kernel stack")

recently changed this for some 32-bit compat syscalls, but introduced a bug where
execve from a 32-bit program to a 64-bit program would fail because it still returned
via SYSRETL. This caused Wine to fail when built for both 32-bit and 64-bit.

This patch sets TIF_NOTIFY_RESUME for execve() and sigreturn() so
that the IRET path is always taken on exit to userspace.
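
As an illustration only, the effect of setting a work flag on the choice
of exit path can be modelled in plain userspace C. This is a simplified
sketch, not the kernel's entry code: every model_* name, the flag's bit
position and the "any flag set means slow path" test are assumptions
made for the example.

```c
/* Userspace model only; none of this is kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a thread's work flags. */
struct model_thread {
	unsigned long flags;
};

/* Bit position chosen arbitrarily for this model. */
#define MODEL_TIF_NOTIFY_RESUME	(1UL << 1)

enum model_exit_path {
	MODEL_EXIT_FAST,	/* stands in for SYSRET/SYSEXIT */
	MODEL_EXIT_IRET,	/* stands in for the IRET path */
};

/* Mirrors the idea of force_iret(): pretend there is work pending. */
static void model_force_iret(struct model_thread *t)
{
	t->flags |= MODEL_TIF_NOTIFY_RESUME;
}

/*
 * Pick the exit path: any pending work flag diverts the return to the
 * slow path, which in this model always ends in IRET. The flags are
 * cleared once the "work" has been handled.
 */
static enum model_exit_path model_syscall_exit(struct model_thread *t)
{
	if (t->flags) {
		t->flags = 0;
		return MODEL_EXIT_IRET;
	}
	return MODEL_EXIT_FAST;
}
```

In this model a thread with no flags set takes MODEL_EXIT_FAST, while one
that called model_force_iret() during the syscall takes MODEL_EXIT_IRET
exactly once, after which the flag is clear again.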

Signed-off-by: Brian Gerst <brgerst@xxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
Cc: H. Peter Anvin <hpa@xxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/1426978461-32089-1-git-send-email-brgerst@xxxxxxxxx
[ Improved the changelog and comments. ]
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
arch/x86/ia32/ia32_signal.c | 2 ++
arch/x86/include/asm/ptrace.h | 2 +-
arch/x86/include/asm/thread_info.h | 10 ++++++++++
arch/x86/kernel/process_32.c | 6 +-----
arch/x86/kernel/process_64.c | 1 +
arch/x86/kernel/signal.c | 2 ++
6 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index d0165c9..1f5e2b0 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -203,6 +203,8 @@ static int ia32_restore_sigcontext(struct pt_regs *regs,

 	err |= restore_xstate_sig(buf, 1);

+	force_iret();
+
 	return err;
 }

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 74bb2e0..83b874d 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -251,7 +251,7 @@ static inline unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
  */
 #define arch_ptrace_stop_needed(code, info) \
 ({ \
-	set_thread_flag(TIF_NOTIFY_RESUME); \
+	force_iret(); \
 	false; \
 })

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index ba115eb..0abf7ab 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -260,6 +260,16 @@ static inline bool is_ia32_task(void)
 #endif
 	return false;
 }
+
+/*
+ * Force syscall return via IRET by making it look as if there was
+ * some work pending. IRET is our most capable (but slowest) syscall
+ * return path, which is able to restore modified SS, CS and certain
+ * EFLAGS values that other (fast) syscall return instructions
+ * are not able to restore properly.
+ */
+#define force_iret() set_thread_flag(TIF_NOTIFY_RESUME)
+
#endif /* !__ASSEMBLY__ */

#ifndef __ASSEMBLY__
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 1b9963f..26c596d 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -206,11 +206,7 @@ start_thread(struct pt_regs *regs, unsigned long new_ip, unsigned long new_sp)
 	regs->ip = new_ip;
 	regs->sp = new_sp;
 	regs->flags = X86_EFLAGS_IF;
-	/*
-	 * force it to the iret return path by making it look as if there was
-	 * some work pending.
-	 */
-	set_thread_flag(TIF_NOTIFY_RESUME);
+	force_iret();
 }
 EXPORT_SYMBOL_GPL(start_thread);

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 97f5658..da8b745 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -239,6 +239,7 @@ start_thread_common(struct pt_regs *regs, unsigned long new_ip,
 	regs->cs = _cs;
 	regs->ss = _ss;
 	regs->flags = X86_EFLAGS_IF;
+	force_iret();
 }

void
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index edcb862..eaa2c5e 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -108,6 +108,8 @@ int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,

 	err |= restore_xstate_sig(buf, config_enabled(CONFIG_X86_32));

+	force_iret();
+
 	return err;
 }
