lockdep reports possible recursive locking for sb_writers from do_iter_write and do_sendfile

From: Andrei Vagin
Date: Fri Jul 07 2017 - 11:27:56 EST


Hello,

We run CRIU tests for Linus' tree and today we found this warning:

[ 27.131931] ============================================
[ 27.132008] WARNING: possible recursive locking detected
[ 27.132085] 4.12.0+ #1 Not tainted
[ 27.132158] --------------------------------------------
[ 27.132243] criu/1537 is trying to acquire lock:
[ 27.132330] (sb_writers#5){.+.+.+}, at: [<ffffffffac3af8a4>]
do_iter_write+0x1d4/0x390
[ 27.132429]
[ 27.132429] but task is already holding lock:
[ 27.132511] (sb_writers#5){.+.+.+}, at: [<ffffffffac3b0861>] +0x671/0x6e0
[ 27.132594]
[ 27.132594] other info that might help us debug this:
[ 27.132671] Possible unsafe locking scenario:
[ 27.132671]
[ 27.132747] CPU0
[ 27.132819] ----
[ 27.132892] lock(sb_writers#5);
[ 27.132972] lock(sb_writers#5);
[ 27.133056]
[ 27.133056] *** DEADLOCK ***
[ 27.133056]
[ 27.133154] May be due to missing lock nesting notation
[ 27.133154]
[ 27.133245] 1 lock held by criu/1537:
[ 27.133325] #0: (sb_writers#5){.+.+.+}, at: [<ffffffffac3b0861>]
do_sendfile+0x671/0x6e0
[ 27.133424]
[ 27.133424] stack backtrace:
[ 27.133515] CPU: 0 PID: 1537 Comm: criu Not tainted 4.12.0+ #1
[ 27.133598] Hardware name: Google Google Compute Engine/Google
Compute Engine, BIOS Google 01/01/2011
[ 27.133690] Call Trace:
[ 27.133774] dump_stack+0x85/0xc2
[ 27.133859] __lock_acquire+0x1dbc/0x1e70
[ 27.133949] ? debug_check_no_locks_freed+0x1c0/0x1c0
[ 27.134034] ? do_splice_direct+0x113/0x160
[ 27.134117] ? do_sendfile+0x37e/0x6e0
[ 27.134199] ? SyS_sendfile64+0x122/0x130
[ 27.134282] ? entry_SYSCALL_64_fastpath+0x23/0xc2
[ 27.134367] ? find_held_lock+0x119/0x150
[ 27.134456] ? __lock_is_held+0x93/0x100
[ 27.134539] ? current_time+0x18/0x80
[ 27.134623] lock_acquire+0x101/0x220
[ 27.134706] ? lock_acquire+0x101/0x220
[ 27.134791] ? do_iter_write+0x1d4/0x390
[ 27.134895] __sb_start_write+0xc6/0x1e0
[ 27.134998] ? do_iter_write+0x1d4/0x390
[ 27.135082] do_iter_write+0x1d4/0x390
[ 27.135167] ? kasan_unpoison_shadow+0x36/0x50
[ 27.135256] ? kasan_kmalloc+0xad/0xe0
[ 27.135346] vfs_iter_write+0x4e/0x70
[ 27.135430] iter_file_splice_write+0x409/0x650
[ 27.135515] ? page_cache_pipe_buf_steal+0x140/0x140
[ 27.135599] ? mark_held_locks+0x8e/0xc0
[ 27.135686] ? common_file_perm+0xe6/0x2c0
[ 27.135769] ? trace_hardirqs_on_caller+0x18b/0x270
[ 27.135853] ? __fsnotify_parent+0x2c/0x130
[ 27.135942] ? rw_verify_area+0x78/0x140
[ 27.136026] direct_splice_actor+0x94/0xb0
[ 27.136110] splice_direct_to_actor+0x1b9/0x400
[ 27.136194] ? generic_pipe_buf_nosteal+0x10/0x10
[ 27.136278] ? do_splice_to+0xc0/0xc0
[ 27.136362] ? rw_verify_area+0x78/0x140
[ 27.136446] do_splice_direct+0x113/0x160
[ 27.136537] ? splice_direct_to_actor+0x400/0x400
[ 27.136621] ? __sb_start_write+0xed/0x1e0
[ 27.136704] ? do_sendfile+0x671/0x6e0
[ 27.136788] ? __fget_light+0xa7/0xc0
[ 27.136872] do_sendfile+0x37e/0x6e0
[ 27.136961] ? do_compat_pwritev64+0xa0/0xa0
[ 27.137047] ? ext4_find_unwritten_pgoff.isra.14+0x480/0x480
[ 27.137132] ? vfs_read+0x15d/0x1d0
[ 27.137215] SyS_sendfile64+0x122/0x130
[ 27.137297] ? SyS_sendfile+0x120/0x120
[ 27.137385] ? trace_hardirqs_on_caller+0x18b/0x270
[ 27.137469] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 27.137554] entry_SYSCALL_64_fastpath+0x23/0xc2
[ 27.137638] RIP: 0033:0x7f2bb27e8f5a
[ 27.137720] RSP: 002b:00007ffc53a398d8 EFLAGS: 00000206 ORIG_RAX:
0000000000000028
[ 27.137813] RAX: ffffffffffffffda RBX: ffffffffac168175 RCX: 00007f2bb27e8f5a
[ 27.137897] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
[ 27.137985] RBP: ffff8801c8547f98 R08: 00000000004af768 R09: ffffffffffffe6f0
[ 27.138069] R10: 00000000000059a8 R11: 0000000000000206 R12: 0000000001e49f00
[ 27.138152] R13: 00007f2bb33081e0 R14: 000000000000000b R15: 0000000000000001
[ 27.138243] ? trace_hardirqs_off_caller+0xc5/0x110

$ git describe origin/master
v4.12-7934-g9f45efb

Here is all logs: https://goo.gl/TK6VSy

Thanks.
Andrei