Re: SYSCALL, ptrace and syscall restart breakages (Re: [RFC] weirdcrap with vdso on uml/i386)

From: Al Viro
Date: Sun Aug 21 2011 - 21:17:17 EST


On Sun, Aug 21, 2011 at 08:44:12PM -0400, Andrew Lutomirski wrote:

> This is, IMO, gross -- if the values in pt_regs matched what they were
> when sysenter / syscall was issued, then we'd be fine -- we could
> restart the syscall and everything would work. Apparently ptrace
> users have a problem with that, so we're stuck with the "lie" (i.e.
> reporting values as of __kernel_vsyscall, not as of the actual kernel
> entry).

Um, _no_. If nothing else, pt_regs is seen by sys_.... And they don't
bloody know or care how the syscall had been entered.

> Which suggests an easy-ish fix: if sysenter is used or if syscall is
> entered from the EIP it is supposed to be entered from, then just
> change ip in the argument save to point to the int 0x80 instruction.
> This might also require tweaking the userspace stack. That way,
> restart would hit int 0x80 instead of syscall/sysenter and the
> registers are exactly as expected.

Huh? Actions after SYSENTER differ from those after int 0x80. If nothing
else, you would need to tweak saved userland stack pointer as well. It is
possible, but I seriously doubt that it's a better way to deal with that
mess. And in any case, SYSEXIT buggers CX/DX, so we'd need two separate
post-syscall sequences in vdso. Yucky... I really don't like it.

The really ugly part for the SYSCALL variant is that right now we *can*
do things like this:
read_it:
	pushl %ebp
	movl $__NR_read, %eax
	movl $0, %ebx
	movl $array, %ebp
	movl $100, %edx
	syscall
	movl $__USER32_DS, %ecx
	movl %ecx, %ss
	popl %ebp
	ret
anywhere in your userland and have it act as an equivalent of
int read_it(void)
{
	return read(0, array, 100);
}

Is that ability a part of the userland ABI, or are we declaring it hopelessly
wrong and requiring everyone to go through the function in vdso32? Linus?

As it is, I don't see any cheap ways to deal with restarts if that thing
has to be preserved. For sysenter it's flatly prohibited and that allows
us to play such games with adjusted return address. Here, OTOH...
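
To make the "adjusted return address" point concrete, here is a toy
userland model of the restart rewind. It is only a sketch and not kernel
code: struct toy_regs, toy_restart() and the TOY_* constants are
hypothetical names, and the model assumes the usual x86 behaviour of
restoring the syscall number and backing the saved ip up by one 2-byte
instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only; all names here are hypothetical, not the kernel's. */
struct toy_regs {
	uint32_t ax;		/* syscall return value */
	uint32_t orig_ax;	/* syscall number saved at entry */
	uint32_t ip;		/* saved userland return address */
};

enum {
	TOY_ERESTARTSYS = 512,	/* kernel-internal "please restart" code */
	TOY_INSN_LEN = 2	/* int $0x80, sysenter and syscall are all 2 bytes */
};

/* Rewind the frame for a restart; returns 1 if a rewind happened. */
int toy_restart(struct toy_regs *regs)
{
	if ((int32_t)regs->ax != -TOY_ERESTARTSYS)
		return 0;
	regs->ax = regs->orig_ax;	/* put the syscall number back */
	regs->ip -= TOY_INSN_LEN;	/* blindly step back one instruction */
	return 1;
}

/* Small self-check: a frame returning to 0x1002 is rewound to 0x1000. */
void toy_selfcheck(void)
{
	struct toy_regs r = {
		.ax = (uint32_t)-TOY_ERESTARTSYS,
		.orig_ax = 3,
		.ip = 0x1002,
	};

	assert(toy_restart(&r) == 1);
	assert(r.ax == 3 && r.ip == 0x1000);
}
```

The rewind has no idea what sits 2 bytes before the saved ip. When the
return address is always the fixed spot in the vdso, the kernel side gets
to choose that instruction; with SYSCALL open-coded anywhere in userland,
the rewind lands back on that SYSCALL itself.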