Re: [PATCH 1/2] x86/mm: don't mmap() over 4GB with compat syscall

From: Dmitry Safonov
Date: Thu Jan 12 2017 - 06:43:54 EST


On 01/12/2017 01:26 AM, Andy Lutomirski wrote:
On Wed, Jan 11, 2017 at 10:17 AM, Dmitry Safonov <dsafonov@xxxxxxxxxxxxx> wrote:
During fixing CRIU bugs on ZDTM tests for 32-bit C/R, I found that
compatible ia32/x32 syscalls mmap() and mmap2() can return address
over 4Gb in x86_64 applications, which results in returning lower
4 bytes of address while dropping the higher bytes.
It happens because mmap() upper limit doesn't differ native/compat
syscalls for 64-bit task, it's: (TASK_UNMAPPED_BASE + random_factor)
which is: (PAGE_ALIGN(TASK_SIZE / 3)) + random_factor
(in case of legacy mmap it's just TASK_SIZE).
This patch limits higher address that can be mmaped with compat
syscalls in 64-bit applications with IA32_PAGE_OFFSET (+randomization).

Signed-off-by: Dmitry Safonov <dsafonov@xxxxxxxxxxxxx>
---
arch/x86/kernel/sys_x86_64.c | 37 +++++++++++++++++++++++++++++--------
1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index a55ed63b9f91..0893725db6e6 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -100,7 +100,7 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
static void find_start_end(unsigned long flags, unsigned long *begin,
unsigned long *end)
{
- if (!test_thread_flag(TIF_ADDR32) && (flags & MAP_32BIT)) {
+ if (!test_thread_flag(TIF_ADDR32)) {
/* This is usually used needed to map code in small
model, so it needs to be in the first 31bit. Limit
it to that. This means we need to move the
@@ -109,14 +109,24 @@ static void find_start_end(unsigned long flags, unsigned long *begin,
malloc knows how to fall back to mmap. Give it 1GB
of playground for now. -AK */
*begin = 0x40000000;
- *end = 0x80000000;
- if (current->flags & PF_RANDOMIZE) {
- *begin = randomize_page(*begin, 0x02000000);
+
+ if (flags & MAP_32BIT) {
+ if (current->flags & PF_RANDOMIZE)
+ *begin = randomize_page(*begin, 0x02000000);
+ *end = 0x80000000;
+ return;
+ }
+ if (current->thread.status & TS_COMPAT) {
+ if (current->flags & PF_RANDOMIZE)
+ *begin = randomize_page(*begin,
+ 1UL << mmap_rnd_compat_bits);
+ *end = IA32_PAGE_OFFSET;
+ return;
}
- } else {
- *begin = current->mm->mmap_legacy_base;
- *end = TASK_SIZE;
}
+
+ *begin = current->mm->mmap_legacy_base;
+ *end = TASK_SIZE;
}

unsigned long
@@ -187,10 +197,21 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
return addr;
}

+ if (current->thread.status & TS_COMPAT) {

in_compat_syscall(), please.

Also, we need to verify that, if this is called execve(), it does the
right thing.

It definitely does not right things at this moment.
Will fix to v2.


+ if (current->flags & PF_RANDOMIZE) {
+ unsigned long rnd = 1UL << mmap_rnd_compat_bits;
+
+ info.high_limit =
+ randomize_page(IA32_PAGE_OFFSET - rnd, rnd);
+ } else {
+ info.high_limit = IA32_PAGE_OFFSET;
+ }
+ } else {
+ info.high_limit = mm->mmap_base;
+ }

This code was incomprehensible before and it's worse now. Could you
try to clean it up a bit? For example, a patch that simply folds
find_start_end() into its sole caller as the first patch in the series
without changing any semantics would probably help.

--Andy



--
Dmitry