Re: MREMAP_DONTUNMAP corrupts initial mapping

From: Lorenzo Stoakes
Date: Thu Mar 30 2023 - 14:43:03 EST


On Thu, Mar 30, 2023 at 07:45:14PM +0500, stsp wrote:
> Add a few CCs.
>
> 30.03.2023 17:38, stsp wrote:
> > Hello.
> >
> > Attached is a small test-case that
> > demonstrates the problem.
> > The problem happens if you change
> > some data in a file-backed private
> > mapping and then use mremap on
> > it with MREMAP_DONTUNMAP flag.
> > The result is:
> > - destination copy is valid
> > - source copy restored from the original file
> >
> > So the 2 copies do not match.

This looks like a case of the documentation not quite covering MAP_PRIVATE
file mappings. From the mremap man page, discussing MREMAP_DONTUNMAP:

After completion, any access to the range specified by old_address and
old_size will result in a page fault. The page fault will be handled
by a userfaultfd(2) handler if the address is in a range previously
registered with userfaultfd(2). Otherwise, the kernel allocates a
zero-filled page to handle the fault.

This documents what happens with the combination of MREMAP_DONTUNMAP and
_anonymous_ mappings. This is accurate in the anonymous mapping case
because after move_page_tables() the VMA remains the same but accesses
cause page faults which will map the zero page.
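
A minimal sketch of the anonymous case, in case it helps to see it
concretely. It assumes Linux 5.7 or later (when MREMAP_DONTUNMAP was added);
dontunmap_anon_demo() is a made-up helper name, and the fallback #define is
only there for libc headers that predate the flag:

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <unistd.h>

#ifndef MREMAP_DONTUNMAP
#define MREMAP_DONTUNMAP 4 /* value from the Linux uapi headers */
#endif

/*
 * Hypothetical helper: writes to a private anonymous page, moves it with
 * MREMAP_DONTUNMAP, then reads the first byte through both mappings.
 * Returns 0 on success, -1 if any step fails (e.g. an older kernel).
 */
int dontunmap_anon_demo(char *src_after, char *dst_after)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    char *src = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (src == MAP_FAILED)
        return -1;

    src[0] = 'B'; /* populate the page before moving it */

    /* old_size must equal new_size when MREMAP_DONTUNMAP is used. */
    char *dst = mremap(src, len, len, MREMAP_MAYMOVE | MREMAP_DONTUNMAP);
    if (dst == MAP_FAILED) {
        munmap(src, len);
        return -1;
    }

    *dst_after = dst[0]; /* the moved page: expected to still hold 'B' */
    *src_after = src[0]; /* faults in a fresh page: expected to read 0 */

    munmap(dst, len);
    munmap(src, len);
    return 0;
}
```

On a kernel that supports the flag I would expect *dst_after to be 'B' and
*src_after to be 0, which is what the man page text describes.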

However, MAP_PRIVATE file-backed mappings have different semantics - if the
page table mappings are invalidated in any way (typically due to file
truncation) then, on fault, the mappings revert to a CoW of the page cache
entries, which is exactly what is happening here.
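
For comparison, a rough sketch of the file-backed case along the lines of
the report (not the attached test-case itself). It assumes Linux 5.13 or
later, where MREMAP_DONTUNMAP accepts non-anonymous mappings; the helper
name and the temporary file path are made up for illustration:

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MREMAP_DONTUNMAP
#define MREMAP_DONTUNMAP 4 /* value from the Linux uapi headers */
#endif

/*
 * Hypothetical helper: maps one page of a file filled with 'A' as
 * MAP_PRIVATE, modifies it, moves it with MREMAP_DONTUNMAP, then reads the
 * first byte through both mappings.
 * Returns 0 on success, -1 if any step fails (e.g. an older kernel).
 */
int dontunmap_private_file_demo(char *src_after, char *dst_after)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    char path[] = "/tmp/dontunmap-XXXXXX"; /* illustrative location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the open fd keeps the file alive */

    /* File contents: one page of 'A'. */
    char *buf = malloc(len);
    if (!buf) {
        close(fd);
        return -1;
    }
    memset(buf, 'A', len);
    ssize_t n = write(fd, buf, len);
    free(buf);
    if (n != (ssize_t)len) {
        close(fd);
        return -1;
    }

    char *src = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping holds its own reference to the file */
    if (src == MAP_FAILED)
        return -1;

    src[0] = 'B'; /* CoW: this page is now a private copy */

    char *dst = mremap(src, len, len, MREMAP_MAYMOVE | MREMAP_DONTUNMAP);
    if (dst == MAP_FAILED) {
        munmap(src, len);
        return -1;
    }

    *dst_after = dst[0]; /* the moved private copy: expected 'B' */
    *src_after = src[0]; /* refaults from the page cache: expected 'A' */

    munmap(dst, len);
    munmap(src, len);
    return 0;
}
```

If the call succeeds, the destination should read 'B' while the source
reads 'A' again, i.e. the file contents rather than a zero-filled page.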

I think this is probably the behaviour you want because fundamentally the
VMAs in both cases map a file and these are the semantics associated with a
MAP_PRIVATE file mapping. You'd otherwise have to either change the
original VMA to be a wholly anonymous mapping (which would cause surprising
behaviour on truncation) or you'd have to explicitly zero the memory and
CoW it in, which doesn't really sound appealing either.

Overall this strikes me as the documentation being a bit outdated now that
MREMAP_DONTUNMAP has been extended to non-anonymous mappings [1]; it
ultimately needs a slightly tweaked explanation to cover this case.

CC-ing Michael, manpages/api lists accordingly.

[1]:https://lore.kernel.org/all/20210323182520.2712101-1-bgeffon@xxxxxxxxxx/