[RESEND PATCH v4 8/8] arm64: Allow 64-bit tasks to invoke compat syscalls

From: Amanieu d'Antras
Date: Tue May 18 2021 - 05:08:16 EST


Setting bit 31 in x8 when performing a syscall will do the following:
- The remainder of x8 is treated as a compat syscall number and is used
to index the compat syscall table.
- in_compat_syscall will return true for the duration of the syscall.
- VM allocations performed by the syscall will be located in the lower
4G of the address space.
- Interrupted syscalls are properly restarted as compat syscalls.
- Seccomp treats the syscall as having AUDIT_ARCH_ARM instead of
AUDIT_ARCH_AARCH64. This affects the arch value seen by seccomp
filters and reported by SIGSYS.
- PTRACE_GET_SYSCALL_INFO also treats the syscall as having
AUDIT_ARCH_ARM. Recent versions of strace will correctly report the
system call name and parameters when an AArch64 task mixes 32-bit and
64-bit syscalls.
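
As a rough illustration of the encoding described above, the following
user-space sketch models how x8 is built and decoded. The helper names
(compat_x8, x8_is_compat, x8_compat_scno) are hypothetical and are not
part of this patch; only the __ARM64_COMPAT_SYSCALL_BIT value comes from
the uapi header change below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the uapi header change in this patch. */
#ifndef __ARM64_COMPAT_SYSCALL_BIT
#define __ARM64_COMPAT_SYSCALL_BIT 0x80000000u
#endif

/* Hypothetical helper: x8 value requesting compat syscall 'compat_scno'. */
static inline uint64_t compat_x8(uint32_t compat_scno)
{
	return (uint64_t)(compat_scno | __ARM64_COMPAT_SYSCALL_BIT);
}

/* Hypothetical model of the bit-31 test performed in do_el0_svc(). */
static inline bool x8_is_compat(uint64_t x8)
{
	return (x8 & __ARM64_COMPAT_SYSCALL_BIT) != 0;
}

/*
 * Hypothetical model of the index used for the compat table: the low
 * 32 bits of x8 with bit 31 cleared (the kernel passes scno as an int).
 */
static inline uint32_t x8_compat_scno(uint64_t x8)
{
	return (uint32_t)x8 & ~__ARM64_COMPAT_SYSCALL_BIT;
}
```

Under this model a user-issued syscall(-1) has bit 31 set and decodes to
compat syscall number 0x7fffffff, which matches the comment added to
el0_svc_common() in this patch.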

Previously, setting bit 31 of the syscall number would always cause the
syscall to fail with ENOSYS. User programs can therefore reliably detect
kernel support for compat syscalls by trying a simple syscall such as
getpid.
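
A minimal sketch of such a probe, assuming a glibc-style syscall()
wrapper that places its first argument in x8 on AArch64. The function
name have_compat_syscalls and the two local macros are hypothetical; 20
is the getpid number in the AArch32 (ARM EABI) syscall table.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Mirrors __ARM64_COMPAT_SYSCALL_BIT from this patch (local copy). */
#define COMPAT_SYSCALL_BIT	0x80000000L
/* getpid in the AArch32 (ARM EABI) syscall table. */
#define COMPAT_NR_GETPID	20

/*
 * Hypothetical probe: returns true only if the kernel dispatched the
 * request through the compat table and returned our PID. A kernel
 * without this patch, or built without CONFIG_COMPAT, is expected to
 * fail the call with ENOSYS, in which case syscall() returns -1.
 */
static bool have_compat_syscalls(void)
{
#if defined(__aarch64__)
	long ret = syscall(COMPAT_SYSCALL_BIT | COMPAT_NR_GETPID);

	return ret == (long)getpid();
#else
	/* The interface only exists on arm64. */
	return false;
#endif
}
```

Comparing the result against getpid() rather than just checking for a
non-negative return keeps the probe from being fooled by a value that
happens to be returned for some unrelated reason.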

The AArch32-private compat syscalls (__ARM_NR_compat_*) are not exposed
through this interface. These syscalls do not make sense in the context
of an AArch64 task.

Signed-off-by: Amanieu d'Antras <amanieu@xxxxxxxxx>
Co-developed-by: Ryan Houdek <Houdek.Ryan@xxxxxxxxxxx>
Signed-off-by: Ryan Houdek <Houdek.Ryan@xxxxxxxxxxx>
---
arch/arm64/include/uapi/asm/unistd.h | 2 ++
arch/arm64/kernel/signal.c | 5 +++++
arch/arm64/kernel/syscall.c | 21 ++++++++++++++++++++-
3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/uapi/asm/unistd.h b/arch/arm64/include/uapi/asm/unistd.h
index f83a70e07df8..5574bc6ab0a3 100644
--- a/arch/arm64/include/uapi/asm/unistd.h
+++ b/arch/arm64/include/uapi/asm/unistd.h
@@ -15,6 +15,8 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/

+#define __ARM64_COMPAT_SYSCALL_BIT 0x80000000
+
#define __ARCH_WANT_RENAMEAT
#define __ARCH_WANT_NEW_STAT
#define __ARCH_WANT_SET_GET_RLIMIT
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 6237486ff6bb..463c8a82050e 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -795,6 +795,11 @@ static void setup_restart_syscall(struct pt_regs *regs)
{
if (is_compat_task())
compat_setup_restart_syscall(regs);
+#ifdef CONFIG_COMPAT
+ else if (in_compat_syscall())
+ regs->regs[8] = __ARM64_COMPAT_SYSCALL_BIT |
+ __NR_compat_restart_syscall;
+#endif
else
regs->regs[8] = __NR_restart_syscall;
}
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index e0e9d54de0a2..83747cf4b5b7 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -118,6 +118,11 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
* user-issued syscall(-1). However, requesting a skip and not
* setting the return value is unlikely to do anything sensible
* anyway.
+ *
+ * This edge case goes away with CONFIG_COMPAT since a
+ * user-issued syscall(-1) is interpreted as a
+ * compat_syscall(0x7fffffff) which still ends up returning
+ * -ENOSYS in x0.
*/
if (scno == NO_SYSCALL)
regs->regs[0] = -ENOSYS;
@@ -165,7 +170,21 @@ static inline void sve_user_discard(void)
void do_el0_svc(struct pt_regs *regs)
{
sve_user_discard();
- el0_svc_common(regs, regs->regs[8], __NR_syscalls, sys_call_table);
+
+#ifdef CONFIG_COMPAT
+ /*
+ * Setting bit 31 of x8 allows a 64-bit process to perform compat
+ * syscalls.
+ */
+ if (regs->regs[8] & __ARM64_COMPAT_SYSCALL_BIT) {
+ current_thread_info()->use_compat_syscall = true;
+ el0_svc_common(regs,
+ regs->regs[8] & ~__ARM64_COMPAT_SYSCALL_BIT,
+ __NR_compat_syscalls, compat_sys_call_table);
+ current_thread_info()->use_compat_syscall = false;
+ } else
+#endif
+ el0_svc_common(regs, regs->regs[8], __NR_syscalls, sys_call_table);
}

#ifdef CONFIG_COMPAT
--
2.31.1