Re: [PATCH] mmap.2: MAP_FIXED is okay if the address range has been reserved

From: Jann Horn
Date: Mon Apr 16 2018 - 09:56:18 EST


On Mon, Apr 16, 2018 at 12:07 PM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> On Fri 13-04-18 18:17:36, Jann Horn wrote:
>> On Fri, Apr 13, 2018 at 6:05 PM, Jann Horn <jannh@xxxxxxxxxx> wrote:
>> > On Fri, Apr 13, 2018 at 6:04 PM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>> >> On Fri 13-04-18 17:04:09, Jann Horn wrote:
>> >>> On Fri, Apr 13, 2018 at 8:49 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>> >>> > On Fri 13-04-18 08:43:27, Michael Kerrisk wrote:
>> >>> > [...]
>> >>> >> So, you mean remove this entire paragraph:
>> >>> >>
>> >>> >> For cases in which the specified memory region has not been
>> >>> >> reserved using an existing mapping, newer kernels (Linux
>> >>> >> 4.17 and later) provide an option MAP_FIXED_NOREPLACE that
>> >>> >> should be used instead; older kernels require the caller to
>> >>> >> use addr as a hint (without MAP_FIXED) and take appropriate
>> >>> >> action if the kernel places the new mapping at a different
>> >>> >> address.
>> >>> >>
>> >>> >> It seems like some version of the first half of the paragraph is worth
>> >>> >> keeping, though, so as to point the reader in the direction of a remedy.
>> >>> >> How about replacing that text with the following:
>> >>> >>
>> >>> >> Since Linux 4.17, the MAP_FIXED_NOREPLACE flag can be used
>> >>> >> in a multithreaded program to avoid the hazard described
>> >>> >> above.
>> >>> >
>> >>> > Yes, that sounds reasonable to me.
>> >>>
>> >>> But that kind of sounds as if you can't avoid it before Linux 4.17,
>> >>> when actually, you just have to call mmap() with the address as hint,
>> >>> and if mmap() returns a different address, munmap() it and go on your
>> >>> normal error path.
>> >>
>> >> This is still racy in a multithreaded application, which is the main
>> >> point of the whole section, no?
>> >
>> > No, it isn't.
>
> I could have been more specific, sorry.
>
>> mmap() with a hint (without MAP_FIXED) will always non-racily allocate
>> a memory region for you or return an error code. If it does allocate a
>> memory region, it belongs to you until you deallocate it. It might be
>> at a different address than you requested -
>
> Yes, this is all true. Except the atomicity is guaranteed only for the
> syscall. Once you return to userspace, any error handling is error
> prone and racy because your mapping might change under your feet. So...

Can you please elaborate on why you think anything could change the
mapping returned by mmap() under the caller's feet?
When mmap() returns a memory area to the caller, that memory area
belongs to the caller. No unrelated code will touch it, unless that
code is buggy.

>> in that case you can
>> emulate MAP_FIXED_NOREPLACE by calling munmap() and treating it as an
>> error; or you can do something else with it.
>>
>> MAP_FIXED_NOREPLACE is just a performance optimization.
>
> This is not quite true because you get _your_ area or _an error_
> atomically which is not possible with 2 syscalls.

Why not?