Re: [RESEND RFC PATCH v2] arm64: Exposes support for 32-bit syscalls

From: Amanieu d'Antras
Date: Fri Feb 12 2021 - 14:09:09 EST


On Fri, Feb 12, 2021 at 6:04 PM Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > The patch proposed by
> > Ryan is based on the kernel patch used by Tango which can be found
> > here: https://github.com/Amanieu/linux/tree/tango-v5.4
> >
> > Efficiency is not the concern here: copying/rearranging some bytes is
> > tiny compared to the cost of a syscall. The main concern is
> > correctness: there are many cases where userspace does not have the
> > information or the capabilities needed to ensure that the 32-bit
> > syscall ABI is correctly emulated.
>
> I do appreciate that today there are cases where the emulator cannot do
> the right thing due to lack of a mechanism, but where the emulator does
> not have knowledge, I don't think that it can safely invoke the syscall.
> Consider if userspace invokes compat_rt_sigreturn() or similar, which
> will trash the emulator's state.
>
> Note that there are cases (e.g. compat_rt_sigreturn()), the kernel
> cannot provide the correct behaviour either. In your tree above, I spot
> at least the following:
>
> * For rt_sigreturn() the kernel will attempt to validate/restore a
> compat sigframe assuming the AArch32 register mappings, then
> valid_user_regs() will blat the PSTATE/SPSR_ELx value to /some/ valid
> AArch64 SPSR_ELx value that happens to mostly alias.
>
> I hope that your emulator doesn't allow emulated apps to call this,
> because it would blat the emulator's state. Regardless, this syscall
> simply cannot do any correct thing in the context of a fake compat
> task, and doesn't make sense to expose.

sigreturn and sigaction and completely emulated by Tango and never
reach the kernel. It simply doesn't make sense to do otherwise since
the kernel has no knowledge of how Tango manages the emulated AArch32
state.

I agree that invoking compat_sys_rt_sigreturn from a 64-bit process
doesn't make sense. My reading of the code is that it will trigger a
SIGSEGV due to valid_user_regs failing. We could add an explicit check
against is_aarch32_compat_task in the compat syscall but the end
result will be the same.

> * For ptrace, operations on the user_aarch32_view (which
> task_user_regset_view() will return for fake compat tasks) will
> erroneously try to convert between SPSR_ELx <-> AARCH32 SPSR layouts,
> assuming the pt_regs::pstate field is using the encoding for AArch32
> EL0, where it's actually the AArch64 EL0 encoding where the layout is
> subtly differnt. Subseqeuntly valid_user_regs() will sanitize that to
> an AArch64 encoding, with the exact same problems as for rt_sigreturn().

Note that user_aarch32_view is only returned when the tracer is
performing a compat syscall. This is no different from a normal 32-bit
process tracing a 64-bit process, which already "works" (the register
state is garbage but everything else works).

Tango doesn't use the regsets and instead retrieves the AArch32
register state directly from the traced process (which is also running
under Tango) with process_vm_readv and returns that when emulating the
various PTRACE_* operations.

> Note that attempting to set the TLS will clobber TPIDRR0_EL0, which
> the kernel will clobber for AArch64 tasks (including your fake compat
> tasks) in the KPTI trampoline. I'm not sure what your emulator expects
> here, and I suspect this also gets clobbered by the case
> tls_thread_flush() tries to cater for.

All the code paths that modify TPIDRR0_EL0 are guarded by
is_aarch32_compat_task which returns false for fake compat tasks. The
call to compat_arm_syscall which handles __ARM_NR_compat_set_tls is
also guarded by is_aarch32_compat_task.

Again, __ARM_NR_compat_set_tls is emulated internally by Tango and the
AArch32 TLS registers visible to translated code are also emulated in
software.

> So this really doesn't make sense to expose either, the kernel cannot
> possibly do something that is correct in this case.
>
> I fully expect that there are more cases where this sort of mismatch
> exists and there isn't some final sanity check that prevents said
> mismatch from breaking the kernel.
>
> Maybe your emulator avoids these, but that's no justification for the
> kernel to expose such broken behaviour.

I disagree that any broken behavior is exposed here.