[PATCH v3 3/3] x86/ptrace, x86/signal: Remove TS_I386_REGS_POKED

From: Andy Lutomirski
Date: Mon Jun 20 2016 - 20:05:55 EST


System call restart has some oddities wrt ptrace:

1. For whatever reason, the kernel delivers signals and triggers
ptrace before handling syscall restart. This means that
-ERESTART_RESTARTBLOCK, etc is visible to userspace. We could
plausibly get away with changing that, but it seems quite risky.

2. As a result of (1), gdb (quite reasonably) expects that it can
snapshot user state on signal delivery, adjust regs to call a
function, and then restore user state.

3. Presumably as a result of (2), we do syscall restart if indicated
by the register state on ptrace resume even if we're *not* resuming
a syscall.

4. Also as a result of (1), gdb expects that writing -1 to orig_eax
via POKEUSER or similar will *disable* syscall restart, which is
necessary to get function calling on syscall exit to work.

The combination of (1) and (4) means that, if we have a 32-bit tracer,
we need to skip syscall restart if orig_eax == -1 (in a 32-bit signed
sense). The combination of (1) and (2) means that, if we have a
32-bit tracer, we need to enable syscall restart if orig_eax > 0 (in a
32-bit signed sense) and eax contains a -ERESTART* code (again in a
signed sense).

The current state of affairs is a mess. Setting a temporary per-task
flag when ptrace changes orig_eax is messy. It does the wrong thing
when ptrace only writes eax. It's also seriously overcomplicated IMO.

Instead, just unconditionally sign-extending them in the ptrace code
and not worrying about ptrace in the signal handling code.

Cc: Pedro Alves <palves@xxxxxxxxxx>
Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
Cc: Kees Cook <keescook@xxxxxxxxxxxx>
Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
---
arch/x86/entry/common.c | 6 +-----
arch/x86/include/asm/syscall.h | 2 +-
arch/x86/include/asm/thread_info.h | 3 ---
arch/x86/kernel/ptrace.c | 37 +++++++++++++++++++++----------------
4 files changed, 23 insertions(+), 25 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 0db497a8ff19..ec138e538c44 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -270,12 +270,8 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
* handling, because syscall restart has a fixup for compat
* syscalls. The fixup is exercised by the ptrace_syscall_32
* selftest.
- *
- * We also need to clear TS_REGS_POKED_I386: the 32-bit tracer
- * special case only applies after poking regs and before the
- * very next return to user mode.
*/
- ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);
+ ti->status &= ~TS_COMPAT;
#endif

user_enter();
diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index 4e23dd15c661..4216bb7cbcba 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -60,7 +60,7 @@ static inline long syscall_get_error(struct task_struct *task,
* TS_COMPAT is set for 32-bit syscall entries and then
* remains set until we return to user mode.
*/
- if (task_thread_info(task)->status & (TS_COMPAT|TS_I386_REGS_POKED))
+ if (task_thread_info(task)->status & TS_COMPAT)
/*
* Sign-extend the value so (int)-EFOO becomes (long)-EFOO
* and will match correctly in comparisons.
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 4bca518d11f4..30c133ac05cd 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -228,9 +228,6 @@ static inline unsigned long current_stack_pointer(void)
* have to worry about atomic accesses.
*/
#define TS_COMPAT 0x0002 /* 32bit syscall active (64BIT)*/
-#ifdef CONFIG_COMPAT
-#define TS_I386_REGS_POKED 0x0004 /* regs poked by 32-bit ptracer */
-#endif
#define TS_RESTORE_SIGMASK 0x0008 /* restore signal mask in do_signal() */

#ifndef __ASSEMBLY__
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index f79576a541ff..c95aba795f88 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -891,6 +891,10 @@ long arch_ptrace(struct task_struct *child, long request,
case offsetof(struct user32, regs.l): \
regs->q = value; break

+#define R32_SIGNED(l,q) \
+ case offsetof(struct user32, regs.l): \
+ regs->q = (long)(s32)value; break
+
#define SEG32(rs) \
case offsetof(struct user32, regs.rs): \
return set_segment_reg(child, \
@@ -917,25 +921,26 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 value)
R32(edi, di);
R32(esi, si);
R32(ebp, bp);
- R32(eax, ax);
R32(eip, ip);
R32(esp, sp);

- case offsetof(struct user32, regs.orig_eax):
- /*
- * Warning: bizarre corner case fixup here. A 32-bit
- * debugger setting orig_eax to -1 wants to disable
- * syscall restart. Make sure that the syscall
- * restart code sign-extends orig_ax. Also make sure
- * we interpret the -ERESTART* codes correctly if
- * loaded into regs->ax in case the task is not
- * actually still sitting at the exit from a 32-bit
- * syscall with TS_COMPAT still set.
- */
- regs->orig_ax = value;
- if (syscall_get_nr(child, regs) >= 0)
- task_thread_info(child)->status |= TS_I386_REGS_POKED;
- break;
+ /*
+ * A 32-bit ptracer has the following expectations:
+ *
+ * - Storing -1 (i.e. 0xffffffff) to orig_eax will prevent
+ * syscall restart handling.
+ *
+ * - Restoring regs saved on exit from an interrupted
+ * restartable syscall will trigger syscall restart. Such
+ * regs will have non-negative orig_eax and negative eax.
+ *
+ * The kernel's syscall restart code treats regs->orig_ax and
+ * regs->ax as 64-bit signed quantities. 32-bit user code
+ * doesn't care about the high bits. Keep it simple and just
+ * sign-extend both values.
+ */
+ R32_SIGNED(orig_eax, orig_ax);
+ R32_SIGNED(eax, ax);

case offsetof(struct user32, regs.eflags):
return set_flags(child, value);
--
2.5.5