Re: arm64: Register modification during syscall entry/exit stop

From: Will Deacon
Date: Tue May 19 2020 - 04:15:59 EST


Hi Keno,

On Mon, May 18, 2020 at 09:05:30PM -0400, Keno Fischer wrote:
> Continuing my theme of "weird things I encounter
> while trying to use ptrace on arm64", I ran into the
> effect of the following code in the syscall entry/exit
> reporting:
>
> ```
> /*
> * A scratch register (ip(r12) on AArch32, x7 on AArch64) is
> * used to denote syscall entry/exit:
> */
> regno = (is_compat_task() ? 12 : 7);
> saved_reg = regs->regs[regno];
> regs->regs[regno] = dir;
> ```
>
> This seems very weird to me. I can't think of any
> other architecture that does something similar
> (other than unicore32 apparently, but the ptrace
> support there seems like it might have just been
> copied from ARM). I'm able to work around this
> in my application, but it adds another stumbling block.

Yes, we inherited this from ARM and I think strace relies on it. In
hindsight, it is a little odd, although x7 is a parameter register in the
PCS and so it won't be live on entry to a system call.

> Some examples of things that happen:
> - Writes to x7 during syscall exit stops are ignored, so
> if the ptracer tries to emulate a setjmp-type thing, it
> might miss this register (ptracers sometimes like to do
> this to manually serialize execution between different
> threads, puppeteering a single thread of execution
> between different register states).
> - Reads from x7 are incorrect, so if the ptracer saves
> a register state and later tries to set it back to the task,
> it may get x7 incorrect, but user space may be expecting
> the register to be preserved (when might this happen? -
> consider a ptracer that wants to modify some syscall
> arguments, it modifies the arguments, restarts the syscall
> but then incurs a signal, so it tries to restore the original
> registers to let userspace deal with the signal without
> being confused - expect signal traps don't ignore x7
> modifications, so x7 may have been unexpectedly
> modified).
> - We now have seccomp traps, which kind of look and
> act like syscall-entry traps, but don't have this behavior,
> so it's not particularly reliable for ptracers to use.
>
> Furthermore, it seems unnecessary to me on modern
> kernels. We now have PTRACE_GET_SYSCALL_INFO,
> which exposes this information without lying to the ptracer
> about the tracee's registers.
>
> I understand, we can't just change this, since people may
> be relying on it, but I would like to propose adding a ptrace
> option (PTRACE_O_ARM_REGSGOOD?) that turns this
> behavior off. Now, I don't think we currently have any other
> arch-specific ptrace options, so maybe there is a different
> option that would be preferable (e.g. could be a different
> regset), but I do think it would be good to have a way to
> operate on the real x7 value. As I said, I can work around it,
> but hopefully I will be able to save a future implementer
> some headache.

I'm not opposed to extending ptrace so that we can try to wean people off
this interface, but I think we need some concrete situations where the
current behaviour actually causes a problem. Although the examples you've
listed above are interesting, I don't see why x7 is important in any of
them (and we only support up to 6 system call arguments).

Will