Re: [PATCH 07/13] mm/munlock: mlock_pte_range() when mlocking or munlocking

From: Vlastimil Babka
Date: Fri Feb 11 2022 - 11:45:34 EST


On 2/6/22 22:42, Hugh Dickins wrote:
> Fill in missing pieces: reimplementation of munlock_vma_pages_range(),
> required to lower the mlock_counts when munlocking without munmapping;
> and its complement, implementation of mlock_vma_pages_range(), required
> to raise the mlock_counts on pages already there when a range is mlocked.
>
> Combine them into just the one function mlock_vma_pages_range(), using
> walk_page_range() to run mlock_pte_range(). This approach fixes the
> "Very slow unlockall()" of unpopulated PROT_NONE areas, reported in
> https://lore.kernel.org/linux-mm/70885d37-62b7-748b-29df-9e94f3291736@xxxxxxxxx/
>
> Munlock clears VM_LOCKED at the start, under exclusive mmap_lock; but if
> a racing truncate or holepunch (depending on i_mmap_rwsem) gets to the
> pte first, it will not try to munlock the page: leaving release_pages()
> to correct it when the last reference to the page is gone - that's okay,
> a page is not evictable anyway while it is held by an extra reference.
>
> Mlock sets VM_LOCKED at the start, under exclusive mmap_lock; but if
> a racing remove_migration_pte() or try_to_unmap_one() (depending on
> i_mmap_rwsem) gets to the pte first, it will try to mlock the page,
> then mlock_pte_range() mlock it a second time. This is harder to
> reproduce, but a more serious race because it could leave the page
> unevictable indefinitely though the area is munlocked afterwards.
> Guard against it by setting the (inappropriate) VM_IO flag,
> and modifying mlock_vma_page() to decline such vmas.
>
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

> @@ -162,8 +230,7 @@ static int mlock_fixup(struct vm_area_struct *vma, struct vm_area_struct **prev,
> pgoff_t pgoff;
> int nr_pages;
> int ret = 0;
> - int lock = !!(newflags & VM_LOCKED);
> - vm_flags_t old_flags = vma->vm_flags;
> + vm_flags_t oldflags = vma->vm_flags;
>
> if (newflags == vma->vm_flags || (vma->vm_flags & VM_SPECIAL) ||

Nit: can use oldflags instead of vma->vm_flags above?