Re: [PATCH v3 04/20] mm: VMA sequence count

From: Sergey Senozhatsky
Date: Wed Sep 13 2017 - 07:54:04 EST


Hi,

On (09/08/17 20:06), Laurent Dufour wrote:
[..]
> @@ -903,6 +910,7 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>  		mm->map_count--;
>  		mpol_put(vma_policy(next));
>  		kmem_cache_free(vm_area_cachep, next);
> +		write_seqcount_end(&next->vm_sequence);
>  		/*
>  		 * In mprotect's case 6 (see comments on vma_merge),
>  		 * we must remove another next too. It would clutter
> @@ -932,11 +940,14 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>  		if (remove_next == 2) {
>  			remove_next = 1;
>  			end = next->vm_end;
> +			write_seqcount_end(&vma->vm_sequence);
>  			goto again;
> -		}
> -		else if (next)
> +		} else if (next) {
> +			if (next != vma)
> +				write_seqcount_begin_nested(&next->vm_sequence,
> +							    SINGLE_DEPTH_NESTING);
>  			vma_gap_update(next);
> -		else {
> +		} else {
>  			/*
>  			 * If remove_next == 2 we obviously can't
>  			 * reach this path.
> @@ -962,6 +973,10 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>  	if (insert && file)
>  		uprobe_mmap(insert);
>  
> +	if (next && next != vma)
> +		write_seqcount_end(&next->vm_sequence);
> +	write_seqcount_end(&vma->vm_sequence);


ok, so what I got on my box is:

vm_munmap()   -> down_write_killable(&mm->mmap_sem)
 do_munmap()
  __split_vma()
   __vma_adjust()   -> write_seqcount_begin(&vma->vm_sequence)
                    -> write_seqcount_begin_nested(&next->vm_sequence, SINGLE_DEPTH_NESTING)

so this gives 3 dependencies:
   ->mmap_sem -> ->vm_seq
   ->vm_seq   -> ->vm_seq/1
   ->mmap_sem -> ->vm_seq/1
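
(just to spell out where the "/1" class comes from -- a minimal sketch, my
reconstruction rather than the actual patch, of the write-side pattern the
quoted hunks set up; the helper name below is made up:)

#include <linux/seqlock.h>
#include <linux/lockdep.h>

/*
 * vma_seq_write_pair() is a made-up name, it just mirrors what
 * __vma_adjust() does in the quoted hunks: the outer seqcount is taken
 * in subclass 0 (reported by lockdep as ->vm_seq), the inner one in
 * subclass 1 (reported as ->vm_seq/1), all under mmap_sem held for
 * writing by the caller.
 */
static void vma_seq_write_pair(seqcount_t *vma_seq, seqcount_t *next_seq)
{
	write_seqcount_begin(vma_seq);				/* ->vm_seq   */
	write_seqcount_begin_nested(next_seq,
				    SINGLE_DEPTH_NESTING);	/* ->vm_seq/1 */

	/* ... adjust both VMAs ... */

	write_seqcount_end(next_seq);
	write_seqcount_end(vma_seq);
}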


SyS_mremap()   -> down_write_killable(&current->mm->mmap_sem)
 move_vma()    -> write_seqcount_begin(&vma->vm_sequence)
               -> write_seqcount_begin_nested(&new_vma->vm_sequence, SINGLE_DEPTH_NESTING);
  move_page_tables()
   __pte_alloc()
    pte_alloc_one()
     __alloc_pages_nodemask()
      fs_reclaim_acquire()


I think here we have the prepare_alloc_pages() call, which does

   -> fs_reclaim_acquire(gfp_mask)
   -> fs_reclaim_release(gfp_mask)

so that adds two more dependency chains:
   ->mmap_sem -> ->vm_seq   -> fs_reclaim
   ->mmap_sem -> ->vm_seq/1 -> fs_reclaim
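
(roughly, and hedging a bit since I'm quoting from memory -- the annotation
boils down to something like the helper below; the name is made up, the
calls are the lockdep-only fs_reclaim_acquire()/fs_reclaim_release()
annotations from mm/page_alloc.c:)

#include <linux/gfp.h>
#include <linux/sched/mm.h>	/* fs_reclaim_acquire()/fs_reclaim_release() */

/*
 * record_fs_reclaim_dep() is a made-up name, sketching what
 * prepare_alloc_pages() does on every allocation: lockdep acquires and
 * immediately releases the fake fs_reclaim lock for any gfp_mask that
 * may enter fs reclaim, so the ->vm_seq -> fs_reclaim dependency gets
 * recorded even when the fast path never reclaims anything.
 */
static inline void record_fs_reclaim_dep(gfp_t gfp_mask)
{
	fs_reclaim_acquire(gfp_mask);
	fs_reclaim_release(gfp_mask);
}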


now, under memory pressure we hit the slow path and perform direct
reclaim. direct reclaim is done under the fs_reclaim lock, so we end up
with the following call chain:

__alloc_pages_nodemask()
 __alloc_pages_slowpath()
  __perform_reclaim()   -> fs_reclaim_acquire(gfp_mask)
   try_to_free_pages()
    shrink_node()
     shrink_active_list()
      rmap_walk_file()  -> i_mmap_lock_read(mapping)
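
(for completeness, a sketch of that reclaim side -- simplified from
__perform_reclaim(), not the exact source, and the function name is made
up -- the point being that the fake fs_reclaim lock is held across the
whole reclaim pass, underneath which the rmap walk takes i_mmap_rwsem:)

#include <linux/gfp.h>
#include <linux/swap.h>		/* try_to_free_pages() */
#include <linux/sched/mm.h>	/* fs_reclaim_acquire()/fs_reclaim_release() */

/* made-up name, modelled on __perform_reclaim() in mm/page_alloc.c */
static int perform_reclaim_sketch(gfp_t gfp_mask, unsigned int order,
				  const struct alloc_context *ac)
{
	int progress;

	fs_reclaim_acquire(gfp_mask);

	/*
	 * shrink_active_list() et al. run under fs_reclaim here, and
	 * rmap_walk_file() takes i_mmap_rwsem via i_mmap_lock_read(),
	 * i.e. i_mmap_rwsem now nests inside fs_reclaim.
	 */
	progress = try_to_free_pages(ac->zonelist, order, gfp_mask,
				     ac->nodemask);

	fs_reclaim_release(gfp_mask);

	return progress;
}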


and this breaks the existing dependency order, since we now take the leaf
lock (fs_reclaim) first and then the root lock (->mmap_sem).


well, seems to be the case.

-ss