Re: [PATCHv5, REBASED 9/9] x86/mm: Allow to have userspace mappings above 47-bits

From: Kirill A. Shutemov
Date: Thu May 18 2017 - 11:41:46 EST


On Thu, May 18, 2017 at 05:27:36PM +0200, Michal Hocko wrote:
> On Thu 18-05-17 18:19:52, Kirill A. Shutemov wrote:
> > On Thu, May 18, 2017 at 01:43:59PM +0200, Michal Hocko wrote:
> > > On Mon 15-05-17 15:12:18, Kirill A. Shutemov wrote:
> > > [...]
> > > > @@ -195,6 +207,16 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
> > > > info.length = len;
> > > > info.low_limit = PAGE_SIZE;
> > > > info.high_limit = get_mmap_base(0);
> > > > +
> > > > + /*
> > > > + * If hint address is above DEFAULT_MAP_WINDOW, look for unmapped area
> > > > + * in the full address space.
> > > > + *
> > > > + * !in_compat_syscall() check to avoid high addresses for x32.
> > > > + */
> > > > + if (addr > DEFAULT_MAP_WINDOW && !in_compat_syscall())
> > > > + info.high_limit += TASK_SIZE_MAX - DEFAULT_MAP_WINDOW;
> > > > +
> > > > info.align_mask = 0;
> > > > info.align_offset = pgoff << PAGE_SHIFT;
> > > > if (filp) {
> > >
> > > I have two questions/concerns here. The above assumes that any address above
> > > 1<<47 will use the _whole_ address space. Is this what we want?
> >
> > Yes, I believe so.
> >
> > > What if somebody does mmap(1<<52, ...) because he wants to (ab)use 53+
> > > bits for some other purpose? Shouldn't we cap the high_limit by the
> > > given address?
> >
> > This would screw existing semantics of hint address -- "map here if
> > free, please".
>
> Well, the given address is just _hint_. We are still allowed to map to a
> different place. And it is not specified whether the resulting mapping
> is above or below that address. So I do not think it would screw the
> existing semantic. Or do I miss something?

You are right that this behaviour is not fixed by any standard or written
down in documentation, but it has been the de-facto policy of Linux mmap(2)
from the beginning.

And we need to be very careful when messing with this.

I believe that qemu linux-user to some extent relies on this behaviour to
do 32-bit allocations on a 64-bit machine.

https://github.com/qemu/qemu/blob/master/linux-user/mmap.c#L256
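
To illustrate the kind of userspace pattern I have in mind, here is a
minimal sketch. It is not qemu's actual code; map_with_hint() and
fits_in_32bit() are made-up names, and the 4GiB limit is only an assumed
example of a guest address-space constraint. The caller passes a hint
without MAP_FIXED, expects "map here if free, please", and then checks
where the mapping really ended up:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: request an anonymous private mapping with
 * `hint` as the preferred address. No MAP_FIXED, so the kernel is
 * free to place the mapping elsewhere. Returns MAP_FAILED on error.
 */
static void *map_with_hint(uintptr_t hint, size_t len)
{
	return mmap((void *)hint, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/*
 * Hypothetical check: does [p, p + len) lie entirely below 4GiB?
 * Written to avoid overflow in the address arithmetic.
 */
static int fits_in_32bit(const void *p, size_t len)
{
	uintptr_t start = (uintptr_t)p;

	assert(len > 0);
	return start <= UINT32_MAX && len - 1 <= UINT32_MAX - start;
}
```

A caller would do p = map_with_hint(hint, len), and if p is not
MAP_FAILED but fits_in_32bit(p, len) is false, munmap() it and retry
with another hint. Code like this works only as long as the kernel keeps
honouring the hint when the range is free.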

--
Kirill A. Shutemov