Re: [PATCH v3 04/20] mm: VMA sequence count

From: Sergey Senozhatsky
Date: Thu Sep 14 2017 - 05:40:56 EST


On (09/14/17 11:15), Laurent Dufour wrote:
> On 14/09/2017 11:11, Sergey Senozhatsky wrote:
> > On (09/14/17 10:58), Laurent Dufour wrote:
> > [..]
> >> That's right, but here this is the sequence counter mm->mm_seq, not the
> >> vm_seq one.
> >
> > d'oh... you are right.
>
> So I doubt a deadlock is really likely here, but I don't like to see
> lockdep complaining. Is there an easy way to make it happy ?


/*
* well... answering your question - it seems the raw versions of the
* seqcount functions don't call lockdep's lock_acquire/lock_release...
*
* but I have never told you that. never.
*/
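
just to illustrate what I mean, a minimal, untested sketch (demo_seq and
the demo_write_*() helpers are made up for illustration, nothing from the
patch set): the normal writer side goes through seqcount_acquire() and
seqcount_release(), which is what lockdep tracks, while the raw helpers
only bump the sequence counter and issue the barriers, so lockdep never
hears about them.

#include <linux/seqlock.h>

static seqcount_t demo_seq = SEQCNT_ZERO(demo_seq);

static void demo_write_annotated(void)
{
	write_seqcount_begin(&demo_seq);	/* seqcount_acquire() -> lockdep */
	/* ... update the fields protected by demo_seq ... */
	write_seqcount_end(&demo_seq);		/* seqcount_release() */
}

static void demo_write_raw(void)
{
	raw_write_seqcount_begin(&demo_seq);	/* sequence++ + smp_wmb(), no lockdep */
	/* ... update the fields protected by demo_seq ... */
	raw_write_seqcount_end(&demo_seq);	/* smp_wmb() + sequence++ */
}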


lockdep, perhaps, can be wrong sometimes, and maybe this is one of those
cases. maybe not... I'm not an MM guy myself.

below is a lockdep splat I got yesterday, with v3 of the SPF patch set.


[ 2763.365898] ======================================================
[ 2763.365899] WARNING: possible circular locking dependency detected
[ 2763.365902] 4.13.0-next-20170913-dbg-00039-ge3c06ea4b028-dirty #1837 Not tainted
[ 2763.365903] ------------------------------------------------------
[ 2763.365905] khugepaged/42 is trying to acquire lock:
[ 2763.365906] (&mapping->i_mmap_rwsem){++++}, at: [<ffffffff811181cc>] rmap_walk_file+0x5a/0x142
[ 2763.365913]
but task is already holding lock:
[ 2763.365915] (fs_reclaim){+.+.}, at: [<ffffffff810e99dc>] fs_reclaim_acquire+0x12/0x35
[ 2763.365920]
which lock already depends on the new lock.

[ 2763.365922]
the existing dependency chain (in reverse order) is:
[ 2763.365924]
-> #3 (fs_reclaim){+.+.}:
[ 2763.365930] lock_acquire+0x176/0x19e
[ 2763.365932] fs_reclaim_acquire+0x32/0x35
[ 2763.365934] __alloc_pages_nodemask+0x6d/0x1f9
[ 2763.365937] pte_alloc_one+0x17/0x62
[ 2763.365940] __pte_alloc+0x1f/0x83
[ 2763.365943] move_page_tables+0x2c3/0x5a2
[ 2763.365944] move_vma.isra.25+0xff/0x29f
[ 2763.365946] SyS_mremap+0x41b/0x49e
[ 2763.365949] entry_SYSCALL_64_fastpath+0x18/0xad
[ 2763.365951]
-> #2 (&vma->vm_sequence/1){+.+.}:
[ 2763.365955] lock_acquire+0x176/0x19e
[ 2763.365958] write_seqcount_begin_nested+0x1b/0x1d
[ 2763.365959] __vma_adjust+0x1c4/0x5f1
[ 2763.365961] __split_vma+0x12c/0x181
[ 2763.365963] do_munmap+0x128/0x2af
[ 2763.365965] vm_munmap+0x5a/0x73
[ 2763.365968] elf_map+0xb1/0xce
[ 2763.365970] load_elf_binary+0x91e/0x137a
[ 2763.365973] search_binary_handler+0x70/0x1f3
[ 2763.365974] do_execveat_common+0x45e/0x68e
[ 2763.365978] call_usermodehelper_exec_async+0xf7/0x11f
[ 2763.365980] ret_from_fork+0x27/0x40
[ 2763.365981]
-> #1 (&vma->vm_sequence){+.+.}:
[ 2763.365985] lock_acquire+0x176/0x19e
[ 2763.365987] write_seqcount_begin_nested+0x1b/0x1d
[ 2763.365989] __vma_adjust+0x1a9/0x5f1
[ 2763.365991] __split_vma+0x12c/0x181
[ 2763.365993] do_munmap+0x128/0x2af
[ 2763.365994] vm_munmap+0x5a/0x73
[ 2763.365996] elf_map+0xb1/0xce
[ 2763.365998] load_elf_binary+0x91e/0x137a
[ 2763.365999] search_binary_handler+0x70/0x1f3
[ 2763.366001] do_execveat_common+0x45e/0x68e
[ 2763.366003] call_usermodehelper_exec_async+0xf7/0x11f
[ 2763.366005] ret_from_fork+0x27/0x40
[ 2763.366006]
-> #0 (&mapping->i_mmap_rwsem){++++}:
[ 2763.366010] __lock_acquire+0xa72/0xca0
[ 2763.366012] lock_acquire+0x176/0x19e
[ 2763.366015] down_read+0x3b/0x55
[ 2763.366017] rmap_walk_file+0x5a/0x142
[ 2763.366018] page_referenced+0xfc/0x134
[ 2763.366022] shrink_active_list+0x1ac/0x37d
[ 2763.366024] shrink_node_memcg.constprop.72+0x3ca/0x567
[ 2763.366026] shrink_node+0x3f/0x14c
[ 2763.366028] try_to_free_pages+0x288/0x47a
[ 2763.366030] __alloc_pages_slowpath+0x3a7/0xa49
[ 2763.366032] __alloc_pages_nodemask+0xf1/0x1f9
[ 2763.366035] khugepaged+0xc8/0x167c
[ 2763.366037] kthread+0x133/0x13b
[ 2763.366039] ret_from_fork+0x27/0x40
[ 2763.366040]
other info that might help us debug this:

[ 2763.366042] Chain exists of:
&mapping->i_mmap_rwsem --> &vma->vm_sequence/1 --> fs_reclaim

[ 2763.366048] Possible unsafe locking scenario:

[ 2763.366049] CPU0 CPU1
[ 2763.366050] ---- ----
[ 2763.366051] lock(fs_reclaim);
[ 2763.366054] lock(&vma->vm_sequence/1);
[ 2763.366056] lock(fs_reclaim);
[ 2763.366058] lock(&mapping->i_mmap_rwsem);
[ 2763.366061]
*** DEADLOCK ***

[ 2763.366063] 1 lock held by khugepaged/42:
[ 2763.366064] #0: (fs_reclaim){+.+.}, at: [<ffffffff810e99dc>] fs_reclaim_acquire+0x12/0x35
[ 2763.366068]
stack backtrace:
[ 2763.366071] CPU: 2 PID: 42 Comm: khugepaged Not tainted 4.13.0-next-20170913-dbg-00039-ge3c06ea4b028-dirty #1837
[ 2763.366073] Call Trace:
[ 2763.366077] dump_stack+0x67/0x8e
[ 2763.366080] print_circular_bug+0x2a1/0x2af
[ 2763.366083] ? graph_unlock+0x69/0x69
[ 2763.366085] check_prev_add+0x76/0x20d
[ 2763.366087] ? graph_unlock+0x69/0x69
[ 2763.366090] __lock_acquire+0xa72/0xca0
[ 2763.366093] ? __save_stack_trace+0xa3/0xbf
[ 2763.366096] lock_acquire+0x176/0x19e
[ 2763.366098] ? rmap_walk_file+0x5a/0x142
[ 2763.366100] down_read+0x3b/0x55
[ 2763.366102] ? rmap_walk_file+0x5a/0x142
[ 2763.366103] rmap_walk_file+0x5a/0x142
[ 2763.366106] page_referenced+0xfc/0x134
[ 2763.366108] ? page_vma_mapped_walk_done.isra.17+0xb/0xb
[ 2763.366109] ? page_get_anon_vma+0x6d/0x6d
[ 2763.366112] shrink_active_list+0x1ac/0x37d
[ 2763.366115] shrink_node_memcg.constprop.72+0x3ca/0x567
[ 2763.366118] ? ___might_sleep+0xd5/0x234
[ 2763.366121] shrink_node+0x3f/0x14c
[ 2763.366123] try_to_free_pages+0x288/0x47a
[ 2763.366126] __alloc_pages_slowpath+0x3a7/0xa49
[ 2763.366128] ? ___might_sleep+0xd5/0x234
[ 2763.366131] __alloc_pages_nodemask+0xf1/0x1f9
[ 2763.366133] khugepaged+0xc8/0x167c
[ 2763.366138] ? remove_wait_queue+0x47/0x47
[ 2763.366140] ? collapse_shmem.isra.45+0x828/0x828
[ 2763.366142] kthread+0x133/0x13b
[ 2763.366145] ? __list_del_entry+0x1d/0x1d
[ 2763.366147] ret_from_fork+0x27/0x40

-ss