[PATCH] ptrace/x86: introduce TS_COMPAT_RESTART to fix get_nr_restart_syscall()

From: Oleg Nesterov
Date: Tue Nov 26 2019 - 06:07:14 EST


The comment in get_nr_restart_syscall() says:

* The problem is that we can get here when ptrace pokes
* syscall-like values into regs even if we're not in a syscall
* at all.

Yes. but if we are not in syscall then the

status & (TS_COMPAT|TS_I386_REGS_POKED)

check below can't really help:

- TS_COMPAT can't be set

- TS_I386_REGS_POKED is only set if regs->orig_ax was changed by
32bit debugger; and even in this case get_nr_restart_syscall()
is only correct if the tracee is 32bit too.

Suppose that 64bit debugger plays with 32bit tracee and

* Tracee calls sleep(2) // TS_COMPAT is set
* User interrupts the tracee by CTRL-C after 1 sec and does
"(gdb) call func()"
* gdb saves the regs by PTRACE_GETREGS
* does PTRACE_SETREGS to set %rip='func' and %orig_rax=-1
* PTRACE_CONT // TS_COMPAT is cleared
* func() hits int3.
* Debugger catches SIGTRAP.
* Restore original regs by PTRACE_SETREGS.
* PTRACE_CONT

get_nr_restart_syscall() wrongly returns __NR_restart_syscall==219, the
tracee calls ia32_sys_call_table[219] == sys_madvise.

This patch adds the sticky TS_COMPAT_RESTART flag which survives after
return to user mode, hopefully it allows us to kill TS_I386_REGS_POKED
but this needs another patch.

Test-case:

$ cvs -d :pserver:anoncvs:anoncvs@xxxxxxxxxxxxxx:/cvs/systemtap co ptrace-tests
$ gcc -o erestartsys-trap-debuggee ptrace-tests/tests/erestartsys-trap-debuggee.c --m32
$ gcc -o erestartsys-trap-debugger ptrace-tests/tests/erestartsys-trap-debugger.c -lutil
$ ./erestartsys-trap-debugger
Unexpected: retval 1, errno 22
erestartsys-trap-debugger: ptrace-tests/tests/erestartsys-trap-debugger.c:421

Reported-by: Jan Kratochvil <jan.kratochvil@xxxxxxxxxx>
Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
---
arch/x86/entry/common.c | 9 +++++++++
arch/x86/include/asm/thread_info.h | 1 +
arch/x86/kernel/signal.c | 6 ++----
3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 3f8e226..1e2c2ca 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -263,6 +263,15 @@ __visible inline void syscall_return_slowpath(struct pt_regs *regs)

rseq_syscall(regs);

+#ifdef CONFIG_IA32_EMULATION
+ if (regs->ax == -ERESTART_RESTARTBLOCK) {
+ struct thread_info *ti = current_thread_info();
+ if (ti->status & TS_COMPAT)
+ ti->status |= TS_COMPAT_RESTART;
+ else
+ ti->status &= ~TS_COMPAT_RESTART;
+ }
+#endif
/*
* First do one-time work. If these work items are enabled, we
* want to run them exactly once per syscall exit with IRQs on.
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index f945353..0784591 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -223,6 +223,7 @@ static inline int arch_within_stack_frames(const void * const stack,

#ifdef CONFIG_COMPAT
#define TS_I386_REGS_POKED 0x0004 /* regs poked by 32-bit ptracer */
+#define TS_COMPAT_RESTART 0x0008
#endif
#ifndef __ASSEMBLY__

diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 8eb7193..1054f1c 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -788,12 +788,10 @@ static inline unsigned long get_nr_restart_syscall(const struct pt_regs *regs)
* For now, we maintain historical behavior and guess based on
* stored state. We could do better by saving the actual
* syscall arch in restart_block or (with caveats on x32) by
- * checking if regs->ip points to 'int $0x80'. The current
- * behavior is incorrect if a tracer has a different bitness
- * than the tracee.
+ * checking if regs->ip points to 'int $0x80'.
*/
#ifdef CONFIG_IA32_EMULATION
- if (current_thread_info()->status & (TS_COMPAT|TS_I386_REGS_POKED))
+ if (current_thread_info()->status & TS_COMPAT_RESTART)
return __NR_ia32_restart_syscall;
#endif
#ifdef CONFIG_X86_X32_ABI
--
2.5.0