[PATCH 4.14 164/164] x86/intel_rdt: Fix potential deadlock during resctrl unmount
From: Greg Kroah-Hartman
Date: Tue Dec 12 2017 - 07:54:45 EST
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Reinette Chatre <reinette.chatre@xxxxxxxxx>
[ Upstream commit 36b6f9fcb8928c06b6638a4cf91bc9d69bb49aa2 ]
Lockdep warns about a potential deadlock:
[ 66.782842] ======================================================
[ 66.782888] WARNING: possible circular locking dependency detected
[ 66.782937] 4.14.0-rc2-test-test+ #48 Not tainted
[ 66.782983] ------------------------------------------------------
[ 66.783052] umount/336 is trying to acquire lock:
[ 66.783117] (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff81032395>] rdt_kill_sb+0x215/0x390
[ 66.783193]
but task is already holding lock:
[ 66.783244] (rdtgroup_mutex){+.+.}, at: [<ffffffff810321b6>] rdt_kill_sb+0x36/0x390
[ 66.783305]
which lock already depends on the new lock.
[ 66.783364]
the existing dependency chain (in reverse order) is:
[ 66.783419]
-> #3 (rdtgroup_mutex){+.+.}:
[ 66.783467] __lock_acquire+0x1293/0x13f0
[ 66.783509] lock_acquire+0xaf/0x220
[ 66.783543] __mutex_lock+0x71/0x9b0
[ 66.783575] mutex_lock_nested+0x1b/0x20
[ 66.783610] intel_rdt_online_cpu+0x3b/0x430
[ 66.783649] cpuhp_invoke_callback+0xab/0x8e0
[ 66.783687] cpuhp_thread_fun+0x7a/0x150
[ 66.783722] smpboot_thread_fn+0x1cc/0x270
[ 66.783764] kthread+0x16e/0x190
[ 66.783794] ret_from_fork+0x27/0x40
[ 66.783825]
-> #2 (cpuhp_state){+.+.}:
[ 66.783870] __lock_acquire+0x1293/0x13f0
[ 66.783906] lock_acquire+0xaf/0x220
[ 66.783938] cpuhp_issue_call+0x102/0x170
[ 66.783974] __cpuhp_setup_state_cpuslocked+0x154/0x2a0
[ 66.784023] __cpuhp_setup_state+0xc7/0x170
[ 66.784061] page_writeback_init+0x43/0x67
[ 66.784097] pagecache_init+0x43/0x4a
[ 66.784131] start_kernel+0x3ad/0x3f7
[ 66.784165] x86_64_start_reservations+0x2a/0x2c
[ 66.784204] x86_64_start_kernel+0x72/0x75
[ 66.784241] verify_cpu+0x0/0xfb
[ 66.784270]
-> #1 (cpuhp_state_mutex){+.+.}:
[ 66.784319] __lock_acquire+0x1293/0x13f0
[ 66.784355] lock_acquire+0xaf/0x220
[ 66.784387] __mutex_lock+0x71/0x9b0
[ 66.784419] mutex_lock_nested+0x1b/0x20
[ 66.784454] __cpuhp_setup_state_cpuslocked+0x52/0x2a0
[ 66.784497] __cpuhp_setup_state+0xc7/0x170
[ 66.784535] page_alloc_init+0x28/0x30
[ 66.784569] start_kernel+0x148/0x3f7
[ 66.784602] x86_64_start_reservations+0x2a/0x2c
[ 66.784642] x86_64_start_kernel+0x72/0x75
[ 66.784678] verify_cpu+0x0/0xfb
[ 66.784707]
-> #0 (cpu_hotplug_lock.rw_sem){++++}:
[ 66.784759] check_prev_add+0x32f/0x6e0
[ 66.784794] __lock_acquire+0x1293/0x13f0
[ 66.784830] lock_acquire+0xaf/0x220
[ 66.784863] cpus_read_lock+0x3d/0xb0
[ 66.784896] rdt_kill_sb+0x215/0x390
[ 66.784930] deactivate_locked_super+0x3e/0x70
[ 66.784968] deactivate_super+0x40/0x60
[ 66.785003] cleanup_mnt+0x3f/0x80
[ 66.785034] __cleanup_mnt+0x12/0x20
[ 66.785070] task_work_run+0x8b/0xc0
[ 66.785103] exit_to_usermode_loop+0x94/0xa0
[ 66.786804] syscall_return_slowpath+0xe8/0x150
[ 66.788502] entry_SYSCALL_64_fastpath+0xab/0xad
[ 66.790194]
other info that might help us debug this:
[ 66.795139] Chain exists of:
cpu_hotplug_lock.rw_sem --> cpuhp_state --> rdtgroup_mutex
[ 66.800035] Possible unsafe locking scenario:
[ 66.803267] CPU0 CPU1
[ 66.804867] ---- ----
[ 66.806443] lock(rdtgroup_mutex);
[ 66.808002] lock(cpuhp_state);
[ 66.809565] lock(rdtgroup_mutex);
[ 66.811110] lock(cpu_hotplug_lock.rw_sem);
[ 66.812608]
*** DEADLOCK ***
[ 66.816983] 2 locks held by umount/336:
[ 66.818418] #0: (&type->s_umount_key#35){+.+.}, at: [<ffffffff81229738>] deactivate_super+0x38/0x60
[ 66.819922] #1: (rdtgroup_mutex){+.+.}, at: [<ffffffff810321b6>] rdt_kill_sb+0x36/0x390
When the resctrl filesystem is unmounted the locks should be obtain in the
locks in the same order as was done when the cpus came online:
cpu_hotplug_lock before rdtgroup_mutex.
This also requires to switch the static_branch_disable() calls to the
_cpulocked variant because now cpu hotplug lock is held already.
[ tglx: Switched to cpus_read_[un]lock ]
Signed-off-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Tested-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@xxxxxxxxx>
Acked-by: Vikas Shivappa <vikas.shivappa@xxxxxxxxxxxxxxx>
Acked-by: Fenghua Yu <fenghua.yu@xxxxxxxxx>
Acked-by: Tony Luck <tony.luck@xxxxxxxxx>
Link: https://lkml.kernel.org/r/cc292e76be073f7260604651711c47b09fd0dc81.1508490116.git.reinette.chatre@xxxxxxxxx
Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -1297,9 +1297,7 @@ static void rmdir_all_sub(void)
kfree(rdtgrp);
}
/* Notify online CPUs to update per cpu storage and PQR_ASSOC MSR */
- get_online_cpus();
update_closid_rmid(cpu_online_mask, &rdtgroup_default);
- put_online_cpus();
kernfs_remove(kn_info);
kernfs_remove(kn_mongrp);
@@ -1310,6 +1308,7 @@ static void rdt_kill_sb(struct super_blo
{
struct rdt_resource *r;
+ cpus_read_lock();
mutex_lock(&rdtgroup_mutex);
/*Put everything back to default values. */
@@ -1317,11 +1316,12 @@ static void rdt_kill_sb(struct super_blo
reset_all_ctrls(r);
cdp_disable();
rmdir_all_sub();
- static_branch_disable(&rdt_alloc_enable_key);
- static_branch_disable(&rdt_mon_enable_key);
- static_branch_disable(&rdt_enable_key);
+ static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
+ static_branch_disable_cpuslocked(&rdt_mon_enable_key);
+ static_branch_disable_cpuslocked(&rdt_enable_key);
kernfs_kill_sb(sb);
mutex_unlock(&rdtgroup_mutex);
+ cpus_read_unlock();
}
static struct file_system_type rdt_fs_type = {