Re: [PATCHv3] x86/mm: set x32 syscall bit in SET_PERSONALITY()

From: Adam Borowski
Date: Tue Mar 21 2017 - 17:17:02 EST


On Tue, Mar 21, 2017 at 08:47:11PM +0300, Dmitry Safonov wrote:
> After my changes to mmap(), its code now relies on the bitness of the
> syscall being performed. Based on that, it chooses the base of allocation:
> mmap_base for 64-bit mmap() and mmap_compat_base for 32-bit syscalls.
> It was done by:
> commit 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for
> 32-bit mmap()").
>
> The code afterwards relies on in_compat_syscall() returning true for
> 32-bit syscalls. It's usually so while we're in context of application
> that does 32-bit syscalls. But during exec() it is not valid for x32 ELF.
> The reason is that the application hasn't yet done any syscall, so the x32
> bit has not been set.
> That results in -ENOMEM for x32 ELF files, as BAD_ADDR() fires
> in elf_map(), which is called from do_execve()->load_elf_binary().
> For i386 ELFs it works as SET_PERSONALITY() sets TS_COMPAT flag.
>
> Set x32 bit before first return to userspace, during setting personality
> at exec(). This way we can rely on in_compat_syscall() during exec().
> Do also the reverse: drop x32 syscall bit at SET_PERSONALITY for 64-bits.
>
> Fixes: commit 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for
> 32-bit mmap()")

Tested:
with bash:x32, mksh:amd64, posh:i386, zsh:armhf (binfmt:qemu), fork+exec
works for every parent-child combination.

Contrary to my naive initial reading of your fix, mixing syscalls from a
process of the wrong ABI also works as it did before. While using a glibc
wrapper will call the right version, x32 processes calling amd64 syscalls is
surprisingly common -- this brings seccomp joy.

I've attached a freestanding test case for write() and mmap(); it's written
in asm because most of you don't have an x32 toolchain at hand. Sorry for
the unfriendly error messages.

So with these two patches:
x86/tls: Forcibly set the accessed bit in TLS segments
x86/mm: set x32 syscall bit in SET_PERSONALITY()
everything appears to be fine.

--
Meow!

Collisions shmolisions, let's see them find a collision or second
preimage for double rot13!
.globl _start
.data
msg: .ascii "Meow!\n"
badmsg: .ascii "syscall failed\n"
.text
_start:
# x32
mov $0x40000001, %rax # syscall: write
mov $1, %rdi
mov $msg, %rsi
mov $6, %rdx
syscall

# amd64
mov $1, %rax # syscall: write
mov $1, %rdi
mov $msg, %rsi
mov $6, %rdx
syscall

# i386
mov $4, %eax # syscall: write
mov $1, %ebx
mov $msg, %ecx
mov $6, %edx
int $0x80


# x32
mov $0x40000009, %rax # syscall: mmap
mov $0, %rdi
mov $0x10000, %rsi
mov $3, %rdx # PROT_READ|PROT_WRITE
mov $0x62, %r10 # MAP_PRIVATE|MAP_ANON|MAP_32BIT
mov $-1, %r8
mov $0, %r9
syscall
or %rax, %rax
js badness

# amd64
mov $0x9, %rax # syscall: mmap
mov $0, %rdi
mov $0x10000, %rsi
mov $3, %rdx # PROT_READ|PROT_WRITE
mov $0x62, %r10 # MAP_PRIVATE|MAP_ANON|MAP_32BIT
mov $-1, %r8
mov $0, %r9
syscall
or %rax, %rax
js badness

jmp goodbye # m'kay, this one doesn't work, no regression
# i386
mov $0x90, %eax # meant as mmap, but 0x90 (144) is msync on i386; mmap2 is 192
mov $0, %ebx
mov $0x10000, %ecx
mov $3, %edx # PROT_READ|PROT_WRITE
mov $0x62, %esi # MAP_PRIVATE|MAP_ANON|MAP_32BIT
mov $-1, %edi
mov $0, %ebp
int $0x80
movslq %eax, %rax
or %rax, %rax
js badness

goodbye:
mov $0x4000003c, %rax # syscall: _exit
xor %rdi, %rdi
syscall

badness:
# I'm too lazy to printf this as a number...
push %rax
mov $0x40000001, %rax # syscall: write
mov $1, %rdi
mov $badmsg, %rsi
mov $15, %rdx
syscall

mov $0x4000003c, %rax # syscall: _exit
pop %rdi
syscall
# Any of amd64/x32/i386 will do.
X86=x86_64-linux-gnu

all: meow-x32 meow-amd64
clean:
	rm -f meow-*

meow-x32: meow.s
	$(X86)-as --x32 $^ -o $@.o
	$(X86)-ld -melf32_x86_64 -s $@.o -o $@

meow-amd64: meow.s
	$(X86)-as --64 $^ -o $@.o
	$(X86)-ld -melf_x86_64 -s $@.o -o $@