Re: [RFC, PATCHv2 29/29] mm, x86: introduce RLIMIT_VADDR

From: Andy Lutomirski
Date: Fri Dec 30 2016 - 21:08:58 EST


On Wed, Dec 28, 2016 at 6:53 PM, Carlos O'Donell <carlos@xxxxxxxxxx> wrote:
> On 12/26/2016 09:24 PM, Kirill A. Shutemov wrote:
>> On Mon, Dec 26, 2016 at 06:06:01PM -0800, Andy Lutomirski wrote:
>>> On Mon, Dec 26, 2016 at 5:54 PM, Kirill A. Shutemov
>>> <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
>>>> This patch introduces a new rlimit resource to manage the maximum
>>>> virtual address available for userspace to map.
>>>>
>>>> On x86, 5-level paging enables 56-bit userspace virtual address space.
>>>> Not all user space is ready to handle wide addresses. It's known that
>>>> at least some JIT compilers use the high bits of pointers to encode
>>>> their own information. This collides with valid pointers under 5-level
>>>> paging and leads to crashes.
>>>>
>>>> The patch aims to address this compatibility issue.
>>>>
>>>> MM would use min(RLIMIT_VADDR, TASK_SIZE) as upper limit of virtual
>>>> address available to map by userspace.
>>>>
>>>> The default hard limit will be RLIM_INFINITY, which basically means that
>>>> TASK_SIZE limits available address space.
>>>>
>>>> The soft limit will also be RLIM_INFINITY everywhere except on machines
>>>> with 5-level paging enabled. In that case, the soft limit would be
>>>> (1UL << 47) - PAGE_SIZE. It's the current x86-64 TASK_SIZE_MAX with
>>>> 4-level paging, which is known to be safe.
>>>>
>>>> The new rlimit resource would follow the usual semantics with regard to
>>>> inheritance: preserved on fork(2) and exec(2). This has the potential to
>>>> break applications if the limits are set too wide or too narrow, but this
>>>> is not uncommon for other resources (consider RLIMIT_DATA or RLIMIT_AS).
>>>>
>>>> As with other resources you can set the limit lower than current usage.
>>>> It would affect only future virtual address space allocations.
>>>>
>>>> Use-cases for new rlimit:
>>>>
>>>> - Bumping the soft limit to RLIM_INFINITY allows the current process and
>>>> all its children to use addresses above 47 bits.
>>>>
>>>> - Bumping the soft limit to RLIM_INFINITY after fork(2), but before
>>>> exec(2), allows the child to use addresses above 47 bits.
>>>>
>>>> - Lowering the hard limit to 47 bits would prevent the current process
>>>> and all its children from using addresses above 47 bits, unless a
>>>> process has CAP_SYS_RESOURCE.
>>>>
>>>> - It can also be handy to lower the hard or soft limit to an arbitrary
>>>> address. User-mode emulation in QEMU may lower the limit to 32 bits
>>>> to emulate a 32-bit machine on a 64-bit host.
>>>
>>> I tend to think that this should be a personality or an ELF flag, not
>>> an rlimit.
>>
>> My plan was to implement an ELF flag on top. Basically, the ELF flag would
>> mean that we bump the soft limit to the hard limit on exec.
>
> Could you clarify what you mean by an "ELF flag?"

Some way to mark a binary as supporting a larger address space. I
don't have a precise solution in mind, but an ELF note might be a good
way to go here.

--Andy
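
For readers following the thread, the limit arithmetic described in the quoted
patch text can be sketched in plain C. This is only an illustration under
stated assumptions, not code from the patch: every sketch_* / SKETCH_* name is
hypothetical, the 4 KiB page size is assumed, the 5-level TASK_SIZE value is
inferred from the "56-bit userspace" statement, and the pointer-tagging check
models a hypothetical JIT that stores data in bits 47 and above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; none of these names exist in the kernel. */
#define SKETCH_PAGE_SIZE   4096ULL
#define SKETCH_RLIM_INF    UINT64_MAX                         /* stands in for RLIM_INFINITY */
#define SKETCH_TASK_SIZE_4 ((1ULL << 47) - SKETCH_PAGE_SIZE)  /* 4-level TASK_SIZE_MAX from the text */
#define SKETCH_TASK_SIZE_5 ((1ULL << 56) - SKETCH_PAGE_SIZE)  /* assumed from "56-bit userspace" */

/* Default soft limit as described: infinite unless 5-level paging is enabled. */
static uint64_t sketch_default_soft(int five_level)
{
    return five_level ? SKETCH_TASK_SIZE_4 : SKETCH_RLIM_INF;
}

/* Upper bound for new mappings: min(RLIMIT_VADDR, TASK_SIZE). */
static uint64_t sketch_map_ceiling(uint64_t rlim_cur, uint64_t task_size)
{
    return rlim_cur < task_size ? rlim_cur : task_size;
}

/*
 * Hypothetical JIT that keeps its own data in bits 47..63 of a pointer:
 * any address with one of those bits set would collide with the tag.
 */
static int sketch_tag_collides(uint64_t addr)
{
    return (addr >> 47) != 0;
}
```

With the default soft limit on a 5-level machine, sketch_map_ceiling() stays at
the 4-level value, so no mapping would reach an address that
sketch_tag_collides() flags. Raising the soft limit to SKETCH_RLIM_INF lifts
the ceiling to the assumed 5-level TASK_SIZE.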