Re: [RFC, PATCHv2 29/29] mm, x86: introduce RLIMIT_VADDR
From: Kirill A. Shutemov
Date: Mon Jan 02 2017 - 04:10:03 EST
On Mon, Dec 26, 2016 at 07:22:03PM -0800, Andy Lutomirski wrote:
> On Mon, Dec 26, 2016 at 6:24 PM, Kirill A. Shutemov
> <kirill@xxxxxxxxxxxxx> wrote:
> > On Mon, Dec 26, 2016 at 06:06:01PM -0800, Andy Lutomirski wrote:
> >> On Mon, Dec 26, 2016 at 5:54 PM, Kirill A. Shutemov
> >> <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
> >> > This patch introduces a new rlimit resource to manage the maximum virtual
> >> > address available to userspace to map.
> >> >
> >> > On x86, 5-level paging enables 56-bit userspace virtual address space.
> >> > Not all user space is ready to handle wide addresses. It's known that
> >> > at least some JIT compilers use the high bits in pointers to encode their
> >> > own information. That collides with valid pointers under 5-level paging
> >> > and leads to crashes.
> >> >
> >> > The patch aims to address this compatibility issue.
> >> >
> >> > MM would use min(RLIMIT_VADDR, TASK_SIZE) as upper limit of virtual
> >> > address available to map by userspace.
> >> >
> >> > The default hard limit will be RLIM_INFINITY, which basically means that
> >> > TASK_SIZE limits available address space.
> >> >
> >> > The soft limit will also be RLIM_INFINITY everywhere except on machines
> >> > with 5-level paging enabled. In that case, the soft limit would be
> >> > (1UL << 47) - PAGE_SIZE. It's the current x86-64 TASK_SIZE_MAX with 4-level
> >> > paging, which is known to be safe.
> >> >
> >> > The new rlimit resource would follow the usual semantics with regard to
> >> > inheritance: preserved on fork(2) and exec(2). This has the potential to
> >> > break applications if the limits are set too wide or too narrow, but this
> >> > is not uncommon for other resources (consider RLIMIT_DATA or RLIMIT_AS).
> >> >
> >> > As with other resources you can set the limit lower than current usage.
> >> > It would affect only future virtual address space allocations.
> >> >
> >> > Use-cases for new rlimit:
> >> >
> >> > - Bumping the soft limit to RLIM_INFINITY allows the current process and
> >> > all its children to use addresses above 47 bits.
> >> >
> >> > - Bumping the soft limit to RLIM_INFINITY after fork(2), but before
> >> > exec(2), allows the child to use addresses above 47 bits.
> >> >
> >> > - Lowering the hard limit to 47 bits would prevent the current process and
> >> > all its children from using addresses above 47 bits, unless a process has
> >> > CAP_SYS_RESOURCE.
> >> >
> >> > - It can also be handy to lower the hard or soft limit to an arbitrary
> >> > address. User-mode emulation in QEMU may lower the limit to 32 bits
> >> > to emulate a 32-bit machine on a 64-bit host.
> >>
> >> I tend to think that this should be a personality or an ELF flag, not
> >> an rlimit.
> >
> > My plan was to implement an ELF flag on top. Basically, the ELF flag would
> > mean that we bump the soft limit to the hard limit on exec.
> >
> >> That way setuid works right.
> >
> > Um.. I'm probably missing some background here.
> >
>
> If a setuid program depends on the lower limit, then a malicious
> program shouldn't be able to cause it to run with the higher limit.
> The personality code should already get this case right because
> personalities are reset when setuid happens.

It would be nice to have more fine-grained control than a binary personality
flag gives. It would cover more use-cases.

Well, we could reset the limit on exec of a setuid binary too. That's not
ideal, but...
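
To make the semantics under discussion concrete, here is a minimal userspace
model in C. It is a sketch, not the patch code. Every name in it is
hypothetical (the sketch_ prefix marks that), the page size is assumed to be
4096, and the 5-level TASK_SIZE value is assumed by analogy with the 4-level
one quoted above. The reset policy shown (soft limit goes back to its default,
capped by the hard limit, hard limit untouched) is only one possible reading
of "reset the limit on exec of a setuid binary".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants: a userspace model, not kernel definitions. */
#define SKETCH_PAGE_SIZE      4096ULL
#define SKETCH_RLIM_INFINITY  UINT64_MAX
#define SKETCH_TASK_SIZE_4LVL ((1ULL << 47) - SKETCH_PAGE_SIZE)
#define SKETCH_TASK_SIZE_5LVL ((1ULL << 56) - SKETCH_PAGE_SIZE) /* assumed */

/* Soft (cur) and hard (max) values of the proposed limit. */
struct sketch_rlimit {
	uint64_t cur;
	uint64_t max;
};

/* Upper bound for new mappings: min(soft limit, TASK_SIZE). */
uint64_t sketch_vaddr_ceiling(const struct sketch_rlimit *lim,
			      uint64_t task_size)
{
	return lim->cur < task_size ? lim->cur : task_size;
}

/* Would a new mapping [addr, addr + len) fit below the ceiling? */
int sketch_mapping_allowed(const struct sketch_rlimit *lim,
			   uint64_t task_size, uint64_t addr, uint64_t len)
{
	uint64_t ceiling = sketch_vaddr_ceiling(lim, task_size);

	if (len == 0 || len > ceiling)
		return 0;
	/* Written this way so that addr + len cannot overflow. */
	return addr <= ceiling - len;
}

/*
 * Assumed policy for exec of a setuid binary: put the soft limit back
 * to its default, never above the hard limit. The hard limit is kept.
 */
void sketch_reset_on_setuid_exec(struct sketch_rlimit *lim, int five_level)
{
	uint64_t def = five_level ? SKETCH_TASK_SIZE_4LVL
				  : SKETCH_RLIM_INFINITY;

	assert(lim->cur <= lim->max);
	lim->cur = def < lim->max ? def : lim->max;
}
```

With this model, a caller that raised its soft limit to infinity can map above
47 bits, but a setuid binary it execs gets the 47-bit default back and cannot.
A hard limit lowered to 32 bits survives the reset.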
--
Kirill A. Shutemov