[PATCH 4.16 46/72] drm/nouveau: Fix deadlock in nv50_mstm_register_connector()

From: Greg Kroah-Hartman
Date: Mon May 14 2018 - 03:05:41 EST


4.16-stable review patch. If anyone has any objections, please let me know.

------------------

From: Lyude Paul <lyude@xxxxxxxxxx>

commit 352672db857290ab5b0e2b6a99c414f92bee024c upstream.

Currently, we're grabbing all of the modesetting locks before adding MST
connectors to fbdev. This isn't actually necessary, and it causes a
deadlock as well: fbdev initialization holds &helper->lock and then takes
the modesetting locks while probing connectors, while the MST link probe
worker takes the modesetting locks and then tries to grab &helper->lock
when registering the new connector, closing the cycle:

======================================================
WARNING: possible circular locking dependency detected
4.17.0-rc3Lyude-Test+ #1 Tainted: G O
------------------------------------------------------
kworker/1:0/18 is trying to acquire lock:
00000000c832f62d (&helper->lock){+.+.}, at: drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]

but task is already holding lock:
00000000942e28e2 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8e/0x1c0 [drm]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (crtc_ww_class_mutex){+.+.}:
ww_mutex_lock+0x43/0x80
drm_modeset_lock+0x71/0x130 [drm]
drm_helper_probe_single_connector_modes+0x7d/0x6b0 [drm_kms_helper]
drm_setup_crtcs+0x15e/0xc90 [drm_kms_helper]
__drm_fb_helper_initial_config_and_unlock+0x29/0x480 [drm_kms_helper]
nouveau_fbcon_init+0x138/0x1a0 [nouveau]
nouveau_drm_load+0x173/0x7e0 [nouveau]
drm_dev_register+0x134/0x1c0 [drm]
drm_get_pci_dev+0x8e/0x160 [drm]
nouveau_drm_probe+0x1a9/0x230 [nouveau]
pci_device_probe+0xcd/0x150
driver_probe_device+0x30b/0x480
__driver_attach+0xbc/0xe0
bus_for_each_dev+0x67/0x90
bus_add_driver+0x164/0x260
driver_register+0x57/0xc0
do_one_initcall+0x4d/0x323
do_init_module+0x5b/0x1f8
load_module+0x20e5/0x2ac0
__do_sys_finit_module+0xb7/0xd0
do_syscall_64+0x60/0x1b0
entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #2 (crtc_ww_class_acquire){+.+.}:
drm_helper_probe_single_connector_modes+0x58/0x6b0 [drm_kms_helper]
drm_setup_crtcs+0x15e/0xc90 [drm_kms_helper]
__drm_fb_helper_initial_config_and_unlock+0x29/0x480 [drm_kms_helper]
nouveau_fbcon_init+0x138/0x1a0 [nouveau]
nouveau_drm_load+0x173/0x7e0 [nouveau]
drm_dev_register+0x134/0x1c0 [drm]
drm_get_pci_dev+0x8e/0x160 [drm]
nouveau_drm_probe+0x1a9/0x230 [nouveau]
pci_device_probe+0xcd/0x150
driver_probe_device+0x30b/0x480
__driver_attach+0xbc/0xe0
bus_for_each_dev+0x67/0x90
bus_add_driver+0x164/0x260
driver_register+0x57/0xc0
do_one_initcall+0x4d/0x323
do_init_module+0x5b/0x1f8
load_module+0x20e5/0x2ac0
__do_sys_finit_module+0xb7/0xd0
do_syscall_64+0x60/0x1b0
entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #1 (&dev->mode_config.mutex){+.+.}:
drm_setup_crtcs+0x10c/0xc90 [drm_kms_helper]
__drm_fb_helper_initial_config_and_unlock+0x29/0x480 [drm_kms_helper]
nouveau_fbcon_init+0x138/0x1a0 [nouveau]
nouveau_drm_load+0x173/0x7e0 [nouveau]
drm_dev_register+0x134/0x1c0 [drm]
drm_get_pci_dev+0x8e/0x160 [drm]
nouveau_drm_probe+0x1a9/0x230 [nouveau]
pci_device_probe+0xcd/0x150
driver_probe_device+0x30b/0x480
__driver_attach+0xbc/0xe0
bus_for_each_dev+0x67/0x90
bus_add_driver+0x164/0x260
driver_register+0x57/0xc0
do_one_initcall+0x4d/0x323
do_init_module+0x5b/0x1f8
load_module+0x20e5/0x2ac0
__do_sys_finit_module+0xb7/0xd0
do_syscall_64+0x60/0x1b0
entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (&helper->lock){+.+.}:
__mutex_lock+0x70/0x9d0
drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]
nv50_mstm_register_connector+0x2c/0x50 [nouveau]
drm_dp_add_port+0x2f5/0x420 [drm_kms_helper]
drm_dp_send_link_address+0x155/0x1e0 [drm_kms_helper]
drm_dp_add_port+0x33f/0x420 [drm_kms_helper]
drm_dp_send_link_address+0x155/0x1e0 [drm_kms_helper]
drm_dp_check_and_send_link_address+0x87/0xd0 [drm_kms_helper]
drm_dp_mst_link_probe_work+0x4d/0x80 [drm_kms_helper]
process_one_work+0x20d/0x650
worker_thread+0x3a/0x390
kthread+0x11e/0x140
ret_from_fork+0x3a/0x50

other info that might help us debug this:

Chain exists of:
  &helper->lock --> crtc_ww_class_acquire --> crtc_ww_class_mutex

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(crtc_ww_class_mutex);
                               lock(crtc_ww_class_acquire);
                               lock(crtc_ww_class_mutex);
  lock(&helper->lock);

*** DEADLOCK ***
5 locks held by kworker/1:0/18:
#0: 000000004a05cd50 ((wq_completion)"events_long"){+.+.}, at: process_one_work+0x187/0x650
#1: 00000000601c11d1 ((work_completion)(&mgr->work)){+.+.}, at: process_one_work+0x187/0x650
#2: 00000000586ca0df (&dev->mode_config.mutex){+.+.}, at: drm_modeset_lock_all+0x3a/0x1b0 [drm]
#3: 00000000d3ca0ffa (crtc_ww_class_acquire){+.+.}, at: drm_modeset_lock_all+0x44/0x1b0 [drm]
#4: 00000000942e28e2 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8e/0x1c0 [drm]

stack backtrace:
CPU: 1 PID: 18 Comm: kworker/1:0 Tainted: G O 4.17.0-rc3Lyude-Test+ #1
Hardware name: Gateway FX6840/FX6840, BIOS P01-A3 05/17/2010
Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
Call Trace:
dump_stack+0x85/0xcb
print_circular_bug.isra.38+0x1ce/0x1db
__lock_acquire+0x128f/0x1350
? lock_acquire+0x9f/0x200
? lock_acquire+0x9f/0x200
? __ww_mutex_lock.constprop.13+0x8f/0x1000
lock_acquire+0x9f/0x200
? drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]
? drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]
__mutex_lock+0x70/0x9d0
? drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]
? ww_mutex_lock+0x43/0x80
? _cond_resched+0x15/0x30
? ww_mutex_lock+0x43/0x80
? drm_modeset_lock+0xb2/0x130 [drm]
? drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]
drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]
nv50_mstm_register_connector+0x2c/0x50 [nouveau]
drm_dp_add_port+0x2f5/0x420 [drm_kms_helper]
? mark_held_locks+0x50/0x80
? kfree+0xcf/0x2a0
? drm_dp_check_mstb_guid+0xd6/0x120 [drm_kms_helper]
? trace_hardirqs_on_caller+0xed/0x180
? drm_dp_check_mstb_guid+0xd6/0x120 [drm_kms_helper]
drm_dp_send_link_address+0x155/0x1e0 [drm_kms_helper]
drm_dp_add_port+0x33f/0x420 [drm_kms_helper]
? nouveau_connector_aux_xfer+0x7c/0xb0 [nouveau]
? find_held_lock+0x2d/0x90
? drm_dp_dpcd_access+0xd9/0xf0 [drm_kms_helper]
? __mutex_unlock_slowpath+0x3b/0x280
? drm_dp_dpcd_access+0xd9/0xf0 [drm_kms_helper]
drm_dp_send_link_address+0x155/0x1e0 [drm_kms_helper]
drm_dp_check_and_send_link_address+0x87/0xd0 [drm_kms_helper]
drm_dp_mst_link_probe_work+0x4d/0x80 [drm_kms_helper]
process_one_work+0x20d/0x650
worker_thread+0x3a/0x390
? process_one_work+0x650/0x650
kthread+0x11e/0x140
? kthread_create_worker_on_cpu+0x50/0x50
ret_from_fork+0x3a/0x50

Following i915's example, the only time we actually need to hold any
modesetting locks is when changing the port on the mstc, and in that
case holding the connection mutex is enough.
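
For reference, here is roughly what the two helpers look like once the
diff below is applied. This is a sketch reconstructed from the patch;
the exact function signatures and the nv50_mstc() container lookup are
assumed from the surrounding driver code rather than shown in the diff:

static void
nv50_mstm_destroy_connector(struct drm_dp_mst_topology_mgr *mgr,
                            struct drm_connector *connector)
{
        struct nouveau_drm *drm = nouveau_drm(connector->dev);
        struct nv50_mstc *mstc = nv50_mstc(connector);

        drm_connector_unregister(&mstc->connector);

        /* fbdev bookkeeping no longer needs any modesetting locks */
        drm_fb_helper_remove_one_connector(&drm->fbcon->helper,
                                           &mstc->connector);

        /* clearing mstc->port only requires the connection mutex */
        drm_modeset_lock(&drm->dev->mode_config.connection_mutex, NULL);
        mstc->port = NULL;
        drm_modeset_unlock(&drm->dev->mode_config.connection_mutex);

        drm_connector_unreference(&mstc->connector);
}

static void
nv50_mstm_register_connector(struct drm_connector *connector)
{
        struct nouveau_drm *drm = nouveau_drm(connector->dev);

        /* no modesetting locks are taken here any more */
        drm_fb_helper_add_one_connector(&drm->fbcon->helper, connector);

        drm_connector_register(connector);
}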

Signed-off-by: Lyude Paul <lyude@xxxxxxxxxx>
Cc: Karol Herbst <kherbst@xxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Ben Skeggs <bskeggs@xxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
drivers/gpu/drm/nouveau/nv50_display.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

--- a/drivers/gpu/drm/nouveau/nv50_display.c
+++ b/drivers/gpu/drm/nouveau/nv50_display.c
@@ -3272,10 +3272,11 @@ nv50_mstm_destroy_connector(struct drm_d

drm_connector_unregister(&mstc->connector);

- drm_modeset_lock_all(drm->dev);
drm_fb_helper_remove_one_connector(&drm->fbcon->helper, &mstc->connector);
+
+ drm_modeset_lock(&drm->dev->mode_config.connection_mutex, NULL);
mstc->port = NULL;
- drm_modeset_unlock_all(drm->dev);
+ drm_modeset_unlock(&drm->dev->mode_config.connection_mutex);

drm_connector_unreference(&mstc->connector);
}
@@ -3285,9 +3286,7 @@ nv50_mstm_register_connector(struct drm_
{
struct nouveau_drm *drm = nouveau_drm(connector->dev);

- drm_modeset_lock_all(drm->dev);
drm_fb_helper_add_one_connector(&drm->fbcon->helper, connector);
- drm_modeset_unlock_all(drm->dev);

drm_connector_register(connector);
}