SYSCALL, ptrace and syscall restart breakages (Re: [RFC] weird crap with vdso on uml/i386)

From: Al Viro
Date: Sun Aug 21 2011 - 04:43:04 EST


On Sun, Aug 21, 2011 at 07:34:43AM +0100, Al Viro wrote:
> Suppose we have a traced process. foo6() is called and the thing is
> stopped before the sys_foo6() is reached kernel-side. The sixth argument
> is on stack, ebp is set to user esp. SYSENTER happens, we read the
> 6th argument from userland stack and put it along with the rest into
> pt_regs. tracer examines the arguments, modifies them (including the last
> one) and lets the tracee run free - e.g. detaches from the tracee.
>
> What should happen if we happen to get a signal that would restart that
> sucker? Granted, it's not going to happen with mmap() - it doesn't, AFAICS,
> do anything of that kind. However, I wouldn't bet a dime on other 6-argument
> syscalls not stepping on that. sendto() and recvfrom(), in particular...
>
> OK, we return to userland. The sixth argument is placed into %ebp. Linus'
> "pig and proud of that" trick works and we end up slapping userland
> %esp into %ebp and hitting SYSENTER again. Only one problem, though -
> the sixth argument on user stack is completely unaffected by what tracer
> had done. Unlike the rest of arguments, that *are* changed.
>
> We could deal with that in case of SYSENTER if we e.g. replaced that
> 	jmp .Lenter_kernel
> with
> 	jmp .Lrestart
> and added
> .Lrestart:
> 	movl %ebp, (%esp)
> 	jmp .Lenter_kernel
> but in case of SYSCALL it seems to be even messier... Comments?
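
For anyone who wants to poke at the quoted SYSENTER scenario, here is a toy Python model of it. It is only a sketch, not kernel code: the function names and the dicts standing in for registers and the user stack are invented for illustration, and it assumes the arg6 slot sits at (%esp) when the restart happens, as the quoted fix implies.

```python
# Toy model of the SYSENTER restart path; not kernel code.
# enter_kernel, restart, run and the dicts are hypothetical names.

def enter_kernel(regs, stack):
    """movl %esp, %ebp; sysenter. The kernel then reads arg6 from the user stack."""
    regs["ebp"] = regs["esp"]
    return stack[regs["ebp"]]

def restart(regs, stack, with_fix):
    """Userland is resumed with %ebp holding arg6 as taken from pt_regs."""
    if with_fix:
        # .Lrestart: movl %ebp, (%esp)
        stack[regs["esp"]] = regs["ebp"]
    return enter_kernel(regs, stack)

def run(with_fix):
    stack = {0x1000: 6}                # arg6 as left on the user stack
    regs = {"esp": 0x1000, "ebp": 0}
    first = enter_kernel(regs, stack)  # kernel sees arg6 == 6
    regs["ebp"] = 66                   # tracer changed arg6; restored into %ebp on return
    return first, restart(regs, stack, with_fix)

print(run(with_fix=False))  # restarted call still sees the old arg6
print(run(with_fix=True))   # restarted call sees the tracer's arg6
```

Without the extra movl the restarted call reads the stale stack slot; with it the tracer's value gets written back before re-entry.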

Oh, hell... Compat SYSCALL one is really buggered on syscall restarts,
ptrace or no ptrace. Look: calling conventions for SYSCALL are
	arg1..5: ebx, ebp, edx, esi, edi. arg6: stack
and after syscall restart we end up with
	arg1..5: ebx, ecx, edx, esi, edi. arg6: ebp
so restart will instantly clobber arg2, in effect replacing it with arg6.
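
A toy Python model of that clobbering, for illustration only. It is a sketch, not kernel code: the helper names and the dict-based registers are made up, and it assumes the vdso stub does "push %ebp; movl %ecx, %ebp" before SYSCALL (needed since SYSCALL itself overwrites %ecx with the return address) and that a restart reruns the SYSCALL instruction without rerunning that prologue.

```python
# Toy model of compat SYSCALL entry and restart; not kernel code.
# All function and variable names are hypothetical.

def vdso_prologue(regs, stack):
    """push %ebp; movl %ecx, %ebp"""
    regs["esp"] -= 4
    stack[regs["esp"]] = regs["ebp"]
    regs["ebp"] = regs["ecx"]

def syscall_entry(regs, stack):
    """What the kernel takes as arg2 and arg6 when SYSCALL is executed."""
    return {"arg2": regs["ebp"], "arg6": stack[regs["esp"]]}

def resume_at_syscall(regs, args):
    """Restart: registers come back in the usual layout (arg2 in %ecx,
    arg6 in %ebp) and SYSCALL is rerun without the prologue."""
    regs["ecx"] = args["arg2"]
    regs["ebp"] = args["arg6"]

regs = {"ecx": 0x22, "ebp": 0x66, "esp": 0x1000}  # arg2 = 0x22, arg6 = 0x66
stack = {}
vdso_prologue(regs, stack)
first = syscall_entry(regs, stack)    # arg2 = 0x22, arg6 = 0x66
resume_at_syscall(regs, first)
second = syscall_entry(regs, stack)   # arg2 is now whatever arg6 was
print(first, second)
```

The second entry takes arg2 from %ebp, which by then holds arg6.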

And yes, adding ptrace to the mix makes things even uglier. For one thing,
changes to ECX via ptrace are completely lost on the fast exit. Not pretty,
and might make life painful for uml, but not for the majority of programs.
What's worse, combination of ptrace with restart will lose changes to arg6
(again, value on stack left as it was, changes to arg6 by tracer lost) *and*
it will lose changes to arg2 (along with arg2 itself - see above).
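
The ptrace plus restart combination can be sketched the same way. Again a toy Python model under stated assumptions, not kernel code: restarted_view and its parameters are hypothetical names, and the model assumes that on restart arg2 is taken from %ebp and arg6 from the untouched user stack slot.

```python
# Toy model of ptrace combined with a compat SYSCALL restart; not kernel code.
# restarted_view and its parameters are hypothetical names.

def restarted_view(arg2, arg6, tracer_arg2, tracer_arg6):
    """Return (arg2, arg6) as the restarted syscall would see them."""
    user_stack_arg6 = arg6                    # slot written before the first SYSCALL
    pt_regs = {"arg2": arg2, "arg6": arg6}    # kernel's copy of the arguments

    # The tracer rewrites both arguments in pt_regs.
    pt_regs["arg2"] = tracer_arg2
    pt_regs["arg6"] = tracer_arg6

    # Return to userland: arg2 goes to %ecx, arg6 goes to %ebp.
    ecx, ebp = pt_regs["arg2"], pt_regs["arg6"]

    # SYSCALL is rerun: %ecx is overwritten by the instruction, arg2 is
    # taken from %ebp and arg6 from the user stack.
    return ebp, user_stack_arg6

print(restarted_view(arg2=2, arg6=6, tracer_arg2=20, tracer_arg6=60))
```

The tracer's arg2 never reaches the restarted call, and its arg6 shows up in the arg2 position while the old arg6 is read from the stack.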

Linus' Dirty Trick(tm) is not trivial to apply - with SYSCALL we *do* retain
the address of next insn and that's where we end up going. IOW, SYSCALL not
inside vdso32 currently works (for small values of "works", due to restart
issues). Playing with return elsewhere might break some userland code...

Guys, that's *way* out of the area I'm comfortable with.