Re: [RESEND PATCH v4 8/8] arm64: Allow 64-bit tasks to invoke compat syscalls
From: Steven Price
Date: Fri May 21 2021 - 04:51:08 EST
On 19/05/2021 17:14, Amanieu d'Antras wrote:
> On Wed, May 19, 2021 at 4:30 PM Steven Price <steven.price@xxxxxxx> wrote:
>> Perhaps I'm missing something, but surely some syscalls that would be
>> native on 32 bit will have to be translated by Tango to 64 bit syscalls
>> to do the right thing? E.g. from the previous patch compat sigreturn
>> isn't available.
>
> That's correct.
>
> Tango handles syscalls in 3 different ways:
> - ~20 syscalls are completely emulated in userspace or through 64-bit
> syscalls. E.g. sigaction, sigreturn, clone, exit.
> - Another ~50 syscalls have various forms of pre/post-processing, but
> are otherwise passed on to the kernel compat syscall handler. E.g.
> open, mmap, ptrace.
> - The remaining syscalls are passed on to the kernel compat syscall
> handler directly.
>
> The first group of ~20 syscalls will effectively bypass the
> user-specified seccomp filter: any 64-bit syscalls used to emulate
> them will be whitelisted. I consider this an acceptable limitation to
> Tango's seccomp support since I see no viable way of supporting
> seccomp filtering for these syscalls.
I agree it's difficult - the only 'solution' I can see is, as I said, to
emulate the BPF code in user space.
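A minimal sketch of what such user-space emulation could look like, assuming the filter only uses the handful of classic-BPF opcodes that simple allow/deny filters tend to rely on. The function name emulate_seccomp() is hypothetical and not taken from Tango; the types and opcode macros come from the Linux UAPI headers. Anything the sketch does not understand fails closed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/filter.h>   /* struct sock_filter, BPF_* opcodes */
#include <linux/seccomp.h>  /* struct seccomp_data, SECCOMP_RET_* */

/*
 * Hypothetical helper: evaluate a classic-BPF seccomp program against
 * a seccomp_data record entirely in user space. Only a small opcode
 * subset is handled; unknown opcodes return SECCOMP_RET_KILL.
 */
uint32_t emulate_seccomp(const struct sock_filter *prog, size_t len,
                         const struct seccomp_data *sd)
{
    uint32_t acc = 0;

    assert(prog != NULL && sd != NULL);

    for (size_t pc = 0; pc < len; pc++) {
        const struct sock_filter *insn = &prog[pc];

        switch (insn->code) {
        case BPF_LD | BPF_W | BPF_ABS:
            /* Loads must be 4-byte aligned and inside seccomp_data. */
            if (insn->k % 4 != 0 || insn->k > sizeof(*sd) - 4)
                return SECCOMP_RET_KILL;
            memcpy(&acc, (const char *)sd + insn->k, sizeof(acc));
            break;
        case BPF_JMP | BPF_JA:
            pc += insn->k;
            break;
        case BPF_JMP | BPF_JEQ | BPF_K:
            pc += (acc == insn->k) ? insn->jt : insn->jf;
            break;
        case BPF_JMP | BPF_JGT | BPF_K:
            pc += (acc > insn->k) ? insn->jt : insn->jf;
            break;
        case BPF_JMP | BPF_JGE | BPF_K:
            pc += (acc >= insn->k) ? insn->jt : insn->jf;
            break;
        case BPF_JMP | BPF_JSET | BPF_K:
            pc += (acc & insn->k) ? insn->jt : insn->jf;
            break;
        case BPF_RET | BPF_K:
            return insn->k;
        default:
            return SECCOMP_RET_KILL;   /* unsupported opcode */
        }
    }
    return SECCOMP_RET_KILL;           /* ran off the end of the program */
}
```

The translator would fill in a seccomp_data describing the 32-bit syscall it is about to emulate and act on the returned SECCOMP_RET_* value itself. A real implementation would also need the scratch memory, ALU and X-register opcodes, and the kernel's rule for combining the results of stacked filters.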
>> In those cases to correctly emulate seccomp, isn't Tango going to
>> have to implement the seccomp filter in user space?
>
> I have not implemented user-mode seccomp emulation because it can
> trivially be bypassed by spawning a 64-bit child process which runs
> outside Tango. Even when spawning another translated process, the
> user-mode filter will not be preserved across an execve.
Clearly if you have user-mode seccomp emulation then you'd hook execve
and either install the real BPF filter (if spawning a 64 bit child
outside Tango) or ensure that the user-mode emulation is passed on to
the child (if running within Tango).
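To illustrate the execve hook described above, here is a rough sketch. It assumes, purely for illustration, that the translator decides between the two cases by looking at the ELF header of the image being executed: a 32-bit ARM image stays under translation, anything else runs natively. The names plan_for_image() and install_kernel_filter() are hypothetical; the prctl() calls are the standard way to install a seccomp filter.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>
#include <sys/prctl.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

enum exec_plan {
    EXEC_INSTALL_KERNEL_FILTER,      /* native child: use the real filter */
    EXEC_HAND_OVER_EMULATED_FILTER,  /* translated child: keep emulating  */
};

/*
 * Assumption: a 32-bit ARM ELF image will run under the translator and
 * everything else runs natively. The header is read in host byte order.
 * Unrecognised or truncated headers take the kernel-filter path, so the
 * decision fails closed.
 */
enum exec_plan plan_for_image(const unsigned char *hdr, size_t len)
{
    Elf32_Ehdr eh;

    assert(hdr != NULL);

    if (len >= sizeof(eh)) {
        memcpy(&eh, hdr, sizeof(eh));
        if (memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
            eh.e_ident[EI_CLASS] == ELFCLASS32 &&
            eh.e_machine == EM_ARM)
            return EXEC_HAND_OVER_EMULATED_FILTER;
    }
    return EXEC_INSTALL_KERNEL_FILTER;
}

/* Install the filter in the kernel just before a native execve(). */
int install_kernel_filter(struct sock_filter *prog, unsigned short len)
{
    struct sock_fprog fprog = { .len = len, .filter = prog };

    /* Required for unprivileged callers before SECCOMP_MODE_FILTER. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &fprog);
}
```

How the emulated filter would be handed to a translated child (inherited file descriptor, environment, shared memory) is left open here, since the thread does not say.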
You already have a potential 'issue' here of a 64 bit process setting up
a seccomp filter and then execve()ing a 32 bit (Tango'd) process. The
set of syscalls needed on a system which supports AArch32 natively is
going to be different from the set of syscalls needed under Tango.
(Fundamentally this is a major limitation with the whole seccomp syscall
filtering approach).
>> I guess the question comes down to how big a hole is
>> syscall_in_tango_whitelist() - if Tango only requires a small set of
>> syscalls then there is still some security benefit, but otherwise this
>> doesn't seem like a particularly big benefit considering you're already
>> going to need the BPF infrastructure in user space.
>
> Currently Tango only whitelists ~50 syscalls, which is small enough to
> provide security benefits and definitely better than not supporting
> seccomp at all.
Agreed, and I don't want to imply that this approach is necessarily
wrong. But given that the approach of getting the kernel to do the
compat syscall filtering is not perfect, I'm not sure it is, in itself,
a great justification for needing the kernel to support all the compat
syscalls.
One other thought: I suspect in practice there aren't actually many
variations in the BPF programs used with seccomp. It may well be quite
possible to convert the 32-bit syscall filtering programs to filter the
equivalent 64-bit syscalls that Tango would use. Sadly this would be
fragile if a program used a BPF program which didn't follow the "normal"
pattern.
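A sketch of that conversion idea, assuming the "normal" pattern is: load the syscall number, then a run of equality tests and returns. The function name convert_filter() and the three-entry number table are illustrative only (ARM EABI read/write/close mapped to their arm64 numbers); a usable table would have to reflect whichever 64-bit syscalls the translator actually issues. Anything outside the pattern is rejected and the program is left untouched, which is where the fragility mentioned above shows up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

struct nr_map { uint32_t arm; uint32_t arm64; };

/* Illustrative subset: ARM EABI syscall number -> arm64 syscall number. */
static const struct nr_map nr_table[] = {
    { 3, 63 },  /* read  */
    { 4, 64 },  /* write */
    { 6, 57 },  /* close */
};

static bool map_nr(uint32_t arm, uint32_t *arm64)
{
    for (size_t i = 0; i < sizeof(nr_table) / sizeof(nr_table[0]); i++) {
        if (nr_table[i].arm == arm) {
            *arm64 = nr_table[i].arm64;
            return true;
        }
    }
    return false;
}

/*
 * Rewrite syscall numbers in place. Returns false, leaving prog
 * unmodified, if the program does not follow the expected pattern.
 */
bool convert_filter(struct sock_filter *prog, size_t len)
{
    uint32_t mapped;

    assert(prog != NULL);

    /* The program must start by loading seccomp_data.nr. */
    if (len < 2 || prog[0].code != (BPF_LD | BPF_W | BPF_ABS) ||
        prog[0].k != offsetof(struct seccomp_data, nr))
        return false;

    /* Pass 1: validate, so that a failure changes nothing. */
    for (size_t i = 1; i < len; i++) {
        if (prog[i].code == (BPF_RET | BPF_K))
            continue;
        if (prog[i].code != (BPF_JMP | BPF_JEQ | BPF_K) ||
            !map_nr(prog[i].k, &mapped))
            return false;
    }

    /* Pass 2: rewrite each compared syscall number. */
    for (size_t i = 1; i < len; i++) {
        if (prog[i].code == (BPF_JMP | BPF_JEQ | BPF_K) &&
            map_nr(prog[i].k, &mapped))
            prog[i].k = mapped;
    }
    return true;
}
```

Note that this sketch also rejects filters that check the arch field first, which well-written filters do; handling that would mean rewriting the audit architecture constant as well.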
Steve