Re: [PATCH] block/mq: Cure cpu hotplug lock inversion
From: Peter Zijlstra
Date: Thu May 04 2017 - 10:19:10 EST
On Thu, May 04, 2017 at 07:56:57AM -0600, Jens Axboe wrote:
> On 05/04/2017 07:05 AM, Peter Zijlstra wrote:
> >
> > By poking at /debug/sched_features I triggered the following splat:
> >
> > [] ======================================================
> > [] WARNING: possible circular locking dependency detected
> > [] 4.11.0-00873-g964c8b7-dirty #694 Not tainted
> > [] ------------------------------------------------------
> > [] bash/2109 is trying to acquire lock:
> > [] (cpu_hotplug_lock.rw_sem){++++++}, at: [<ffffffff8120cb8b>] static_key_slow_dec+0x1b/0x50
> > []
> > [] but task is already holding lock:
> > [] (&sb->s_type->i_mutex_key#4){+++++.}, at: [<ffffffff81140216>] sched_feat_write+0x86/0x170
> > []
> > [] which lock already depends on the new lock.
> > []
> > []
> > [] the existing dependency chain (in reverse order) is:
> > []
> > [] -> #2 (&sb->s_type->i_mutex_key#4){+++++.}:
> > [] lock_acquire+0x100/0x210
> > [] down_write+0x28/0x60
> > [] start_creating+0x5e/0xf0
> > [] debugfs_create_dir+0x13/0x110
> > [] blk_mq_debugfs_register+0x21/0x70
> > [] blk_mq_register_dev+0x64/0xd0
> > [] blk_register_queue+0x6a/0x170
> > [] device_add_disk+0x22d/0x440
> > [] loop_add+0x1f3/0x280
> > [] loop_init+0x104/0x142
> > [] do_one_initcall+0x43/0x180
> > [] kernel_init_freeable+0x1de/0x266
> > [] kernel_init+0xe/0x100
> > [] ret_from_fork+0x31/0x40
> > []
> > [] -> #1 (all_q_mutex){+.+.+.}:
> > [] lock_acquire+0x100/0x210
> > [] __mutex_lock+0x6c/0x960
> > [] mutex_lock_nested+0x1b/0x20
> > [] blk_mq_init_allocated_queue+0x37c/0x4e0
> > [] blk_mq_init_queue+0x3a/0x60
> > [] loop_add+0xe5/0x280
> > [] loop_init+0x104/0x142
> > [] do_one_initcall+0x43/0x180
> > [] kernel_init_freeable+0x1de/0x266
> > [] kernel_init+0xe/0x100
> > [] ret_from_fork+0x31/0x40
> >
> > [] *** DEADLOCK ***
> > []
> > [] 3 locks held by bash/2109:
> > [] #0: (sb_writers#11){.+.+.+}, at: [<ffffffff81292bcd>] vfs_write+0x17d/0x1a0
> > [] #1: (debugfs_srcu){......}, at: [<ffffffff8155a90d>] full_proxy_write+0x5d/0xd0
> > [] #2: (&sb->s_type->i_mutex_key#4){+++++.}, at: [<ffffffff81140216>] sched_feat_write+0x86/0x170
> > []
> > [] stack backtrace:
> > [] CPU: 9 PID: 2109 Comm: bash Not tainted 4.11.0-00873-g964c8b7-dirty #694
> > [] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
> > [] Call Trace:
> >
> > [] lock_acquire+0x100/0x210
> > [] get_online_cpus+0x2a/0x90
> > [] static_key_slow_dec+0x1b/0x50
> > [] static_key_disable+0x20/0x30
> > [] sched_feat_write+0x131/0x170
> > [] full_proxy_write+0x97/0xd0
> > [] __vfs_write+0x28/0x120
> > [] vfs_write+0xb5/0x1a0
> > [] SyS_write+0x49/0xa0
> > [] entry_SYSCALL_64_fastpath+0x23/0xc2
> >
> > This is because of the cpu hotplug lock rework. Break the chain at #1
> > by reversing the lock acquisition order, taking all_q_mutex before
> > cpu_hotplug_lock. This way i_mutex_key#4 no longer depends on
> > cpu_hotplug_lock and things are good.
>
> Thanks Peter, applied.
Note that the hotplug rework is still work-in-progress and lives in a
-tip branch.
That said, the patch is harmless outside of that rework, so yes, it can
travel upstream independently; mainline just cannot trigger this splat
yet.
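
For reference, the break at #1 boils down to flipping the order of
get_online_cpus() and mutex_lock(&all_q_mutex) in
blk_mq_init_allocated_queue(). A before/after sketch against the
4.11-era tail of that function (paraphrased, not the verbatim hunk):

        /*
         * Before: creates cpu_hotplug_lock -> all_q_mutex. Since
         * blk_mq_register_dev() takes the debugfs i_mutex while
         * holding all_q_mutex, i_mutex transitively ends up below
         * cpu_hotplug_lock in the dependency graph.
         */
        get_online_cpus();
        mutex_lock(&all_q_mutex);

        list_add_tail(&q->all_q_node, &all_q_list);
        blk_mq_add_queue_tag_set(set, q);
        blk_mq_map_swqueue(q, cpu_online_mask);

        mutex_unlock(&all_q_mutex);
        put_online_cpus();

        /*
         * After: all_q_mutex -> cpu_hotplug_lock. The i_mutex now only
         * nests inside all_q_mutex, so sched_feat_write() acquiring
         * cpu_hotplug_lock under i_mutex (via static_key_disable())
         * no longer closes a cycle.
         */
        mutex_lock(&all_q_mutex);
        get_online_cpus();

        list_add_tail(&q->all_q_node, &all_q_list);
        blk_mq_add_queue_tag_set(set, q);
        blk_mq_map_swqueue(q, cpu_online_mask);

        put_online_cpus();
        mutex_unlock(&all_q_mutex);

With all_q_mutex consistently on the outside, the debugfs inode lock
taken from blk_mq_register_dev() can never sit below cpu_hotplug_lock,
which is exactly what the splat required.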