Re: [PATCHv3 33/33] mm, x86: introduce PR_SET_MAX_VADDR and PR_GET_MAX_VADDR

From: Michael Pratt
Date: Tue Feb 21 2017 - 01:11:41 EST


Sigh... apologies for the HTML. Trying again...

On Mon, Feb 20, 2017 at 9:21 PM, Michael Pratt <linux@xxxxxxxx> wrote:
> On Fri, Feb 17, 2017 at 3:02 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>> On Fri, Feb 17, 2017 at 1:01 PM, Linus Torvalds
>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>> On Fri, Feb 17, 2017 at 12:12 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>>>
>>>> At the very least, I'd want to see
>>>> MAP_FIXED_BUT_DONT_BLOODY_UNMAP_ANYTHING. I *hate* the current
>>>> interface.
>>>
>>> That's unrelated, but I guess we could add a MAP_NOUNMAP flag, and then
>>> you can use MAP_FIXED | MAP_NOUNMAP or something.
>>>
>>> But that has nothing to do with the 47-vs-56 bit issue.
>>>
>>>> How about MAP_LIMIT where the address passed in is interpreted as an
>>>> upper bound instead of a fixed address?
>>>
>>> Again, that's an unrelated semantic issue. Right now - if you don't
>>> pass in MAP_FIXED at all, the "addr" argument is used as a starting
>>> value for deciding where to find an unmapped area. But there is no way
>>> to specify the end. That would basically be what the process control
>>> thing would be (not per-system-call, but per-thread).
>>>
>>
>> What I'm trying to say is: if we're going to do the route of 48-bit
>> limit unless a specific mmap call requests otherwise, can we at least
>> have an interface that doesn't suck?

I've got a set of patches that I've meant to send out as an RFC for a
while that tries to address userspace control of address space layout
and covers many of these ideas.

There is a new syscall and set of prctls for controlling the "mmap
layout" (i.e., get_unmapped_area search range) that look something
like this:

struct mmap_layout {
	unsigned long start;
	unsigned long end;
	/*
	 * These are equivalent to mmap_legacy_base and mmap_base,
	 * but are not really needed in this proposal.
	 */
	unsigned long low_base;
	unsigned long high_base;
	unsigned long flags;
};

/* For flags */
#define MMAP_TOPDOWN 1

struct layout_mmap_args {
	unsigned long addr;
	unsigned long len;
	unsigned long prot;
	unsigned long flags;
	unsigned long fd;
	unsigned long off;
	struct mmap_layout layout;
};

void *layout_mmap(struct layout_mmap_args *args);

int prctl(PR_GET_MMAP_LAYOUT, struct mmap_layout *layout);
int prctl(PR_SET_MMAP_LAYOUT, struct mmap_layout *layout);

The prctls control the default range from which mmap and friends will
allocate. For a 56-bit user address space, it could default to
[mmap_min_addr, 1<<47), as Linus suggests. Applications that want the
full address space can increase it to cover the entire range.
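
To make the default-range idea concrete, here is a rough userspace
model of it. This is only a sketch, not code from the patches: the
helper names (example_default_layout, layout_allows) and the
EXAMPLE_MMAP_MIN_ADDR value are made up for illustration, the
top-down default is an assumption, and it assumes a 64-bit unsigned
long. It does not call the proposed prctls, which do not exist in any
released kernel.

```c
#include <assert.h>

/* Same fields as the struct mmap_layout proposed in this message. */
struct mmap_layout {
	unsigned long start;
	unsigned long end;
	unsigned long low_base;
	unsigned long high_base;
	unsigned long flags;
};

#define MMAP_TOPDOWN 1

/* Placeholder standing in for the vm.mmap_min_addr sysctl value. */
#define EXAMPLE_MMAP_MIN_ADDR 0x10000UL

/* Hypothetical default layout: [mmap_min_addr, 1<<47). */
static struct mmap_layout example_default_layout(void)
{
	struct mmap_layout l = {
		.start = EXAMPLE_MMAP_MIN_ADDR,
		.end = 1UL << 47,
		.flags = MMAP_TOPDOWN,	/* assumed default, not from the patches */
	};

	return l;
}

/*
 * Return nonzero if [addr, addr + len) lies entirely inside the
 * half-open range [l->start, l->end). Written to avoid overflow in
 * addr + len.
 */
static int layout_allows(const struct mmap_layout *l,
			 unsigned long addr, unsigned long len)
{
	assert(l->start <= l->end);
	return addr >= l->start && addr <= l->end && len <= l->end - addr;
}
```

With the default layout, a candidate at or above 1<<47 is rejected. An
application that raises end to 1UL << 56 (what PR_SET_MMAP_LAYOUT
would do in this proposal) makes the same candidate acceptable.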

The layout_mmap syscall allows one-off mappings that fall outside the
default layout, and nicely solves the "MAP_FIXED but don't unmap
anything" problem by passing an explicit range to check without
actually setting MAP_FIXED.
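
As an illustration of why an explicit range covers the "don't unmap
anything" case, here is a simplified userspace model of a
range-limited search. It is a sketch only: find_gap, struct range and
NO_GAP are hypothetical names, the search is a plain bottom-up
first-fit over a sorted array, and the real get_unmapped_area works
on the kernel's VMA structures rather than anything like this.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open interval [start, end) describing an existing mapping. */
struct range {
	unsigned long start;
	unsigned long end;
};

#define NO_GAP ((unsigned long)-1)

/*
 * Find the lowest address in [start, end) where len bytes fit without
 * overlapping any entry in maps. maps must be sorted by address and
 * non-overlapping. Returns NO_GAP if nothing fits. Existing mappings
 * are never modified.
 */
static unsigned long find_gap(const struct range *maps, size_t n,
			      unsigned long start, unsigned long end,
			      unsigned long len)
{
	unsigned long cand = start;
	size_t i;

	assert(start <= end);
	if (len == 0 || end - start < len)
		return NO_GAP;

	for (i = 0; i < n; i++) {
		if (maps[i].end <= cand)
			continue;	/* mapping lies below the candidate */
		if (maps[i].start >= cand && maps[i].start - cand >= len)
			break;		/* gap before this mapping is big enough */
		cand = maps[i].end;	/* skip past the conflicting mapping */
		if (cand > end || end - cand < len)
			return NO_GAP;
	}
	return cand;
}
```

Passing a range of exactly len bytes, i.e. [addr, addr + len), gives
the fixed-address behavior: the call either returns addr or fails, and
in neither case is an existing mapping touched.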

This idea is quite similar to the MAX_VADDR + default
get_unmapped_area behavior idea, just more generalized to give
userspace more control over the ultimate behavior of
get_unmapped_area.


PS. Apologies if my email client screwed up this message. I didn't
have this thread in my client and have tried to import it from another
account.