Re: [Intel-gfx] [PATCH] drm/i915: fix deadlock on lid open

From: Daniel Vetter
Date: Wed Mar 30 2016 - 05:54:38 EST


On Wed, Mar 30, 2016 at 11:08:33AM +0200, Bjørn Mork wrote:
> commit e2c8b8701e2d moved modeset locking inside resume/suspend
> functions, but missed a code path only executed on lid close/open
> on older hardware. The result was a deadlock when closing and
> opening the lid without suspending on such hardware:
>
> =============================================
> [ INFO: possible recursive locking detected ]
> 4.6.0-rc1 #385 Not tainted
> ---------------------------------------------
> kworker/0:3/88 is trying to acquire lock:
> (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa063e6a4>] intel_display_resume+0x4a/0x12f [i915]
>
> but task is already holding lock:
> (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa02d0d4f>] drm_modeset_lock_all+0x3e/0xa6 [drm]
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&dev->mode_config.mutex);
> lock(&dev->mode_config.mutex);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 7 locks held by kworker/0:3/88:
> #0: ("kacpi_notify"){++++.+}, at: [<ffffffff81068dfc>] process_one_work+0x14a/0x50b
> #1: ((&dpc->work)#2){+.+.+.}, at: [<ffffffff81068dfc>] process_one_work+0x14a/0x50b
> #2: ((acpi_lid_notifier).rwsem){++++.+}, at: [<ffffffff8106f874>] __blocking_notifier_call_chain+0x34/0x65
> #3: (&dev_priv->modeset_restore_lock){+.+.+.}, at: [<ffffffffa0664cf6>] intel_lid_notify+0x3c/0xd9 [i915]
> #4: (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa02d0d4f>] drm_modeset_lock_all+0x3e/0xa6 [drm]
> #5: (crtc_ww_class_acquire){+.+.+.}, at: [<ffffffffa02d0d59>] drm_modeset_lock_all+0x48/0xa6 [drm]
> #6: (crtc_ww_class_mutex){+.+.+.}, at: [<ffffffffa02d0b2a>] modeset_lock+0x13c/0x1cd [drm]
>
> stack backtrace:
> CPU: 0 PID: 88 Comm: kworker/0:3 Not tainted 4.6.0-rc1 #385
> Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
> Workqueue: kacpi_notify acpi_os_execute_deferred
> 0000000000000000 ffff88022fd5f990 ffffffff8124af06 ffffffff825b39c0
> ffffffff825b39c0 ffff88022fd5fa60 ffffffff8108f547 ffff88022fd5fa70
> 000000008108e817 ffff880230236cc0 0000000000000000 ffffffff825b39c0
> Call Trace:
> [<ffffffff8124af06>] dump_stack+0x67/0x90
> [<ffffffff8108f547>] __lock_acquire+0xdb5/0xf71
> [<ffffffff8108bd2c>] ? look_up_lock_class+0xbe/0x10a
> [<ffffffff8108fae2>] lock_acquire+0x137/0x1cb
> [<ffffffff8108fae2>] ? lock_acquire+0x137/0x1cb
> [<ffffffffa063e6a4>] ? intel_display_resume+0x4a/0x12f [i915]
> [<ffffffff8148202f>] mutex_lock_nested+0x7e/0x3a4
> [<ffffffffa063e6a4>] ? intel_display_resume+0x4a/0x12f [i915]
> [<ffffffffa063e6a4>] ? intel_display_resume+0x4a/0x12f [i915]
> [<ffffffffa02d0b2a>] ? modeset_lock+0x13c/0x1cd [drm]
> [<ffffffffa063e6a4>] intel_display_resume+0x4a/0x12f [i915]
> [<ffffffffa063e6a4>] ? intel_display_resume+0x4a/0x12f [i915]
> [<ffffffffa02d0b2a>] ? modeset_lock+0x13c/0x1cd [drm]
> [<ffffffffa02d0b2a>] ? modeset_lock+0x13c/0x1cd [drm]
> [<ffffffffa02d0bf7>] ? drm_modeset_lock+0x17/0x24 [drm]
> [<ffffffffa02d0c8b>] ? drm_modeset_lock_all_ctx+0x87/0xa1 [drm]
> [<ffffffffa0664d6a>] intel_lid_notify+0xb0/0xd9 [i915]
> [<ffffffff8106f4c6>] notifier_call_chain+0x4a/0x6c
> [<ffffffff8106f88d>] __blocking_notifier_call_chain+0x4d/0x65
> [<ffffffff8106f8b9>] blocking_notifier_call_chain+0x14/0x16
> [<ffffffffa0011215>] acpi_lid_send_state+0x83/0xad [button]
> [<ffffffffa00112a6>] acpi_button_notify+0x41/0x132 [button]
> [<ffffffff812b07df>] acpi_device_notify+0x19/0x1b
> [<ffffffff812c8570>] acpi_ev_notify_dispatch+0x49/0x64
> [<ffffffff812ab9fb>] acpi_os_execute_deferred+0x14/0x20
> [<ffffffff81068f17>] process_one_work+0x265/0x50b
> [<ffffffff810696f5>] worker_thread+0x1fc/0x2dd
> [<ffffffff810694f9>] ? rescuer_thread+0x309/0x309
> [<ffffffff810694f9>] ? rescuer_thread+0x309/0x309
> [<ffffffff8106e2d6>] kthread+0xe0/0xe8
> [<ffffffff8107bc47>] ? local_clock+0x19/0x22
> [<ffffffff81484f42>] ret_from_fork+0x22/0x40
> [<ffffffff8106e1f6>] ? kthread_create_on_node+0x1b5/0x1b5
>
> Fixes: e2c8b8701e2d ("drm/i915: Use atomic helpers for suspend, v2.")
> Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
> Signed-off-by: Bjørn Mork <bjorn@xxxxxxx>

Oops, that one's pretty silly. Unfortunately we don't have any such
machines in CI yet, and it wouldn't be possible to exercise the lid
notifier automatically.

Thanks for your fix, applied.
-Daniel

> ---
> drivers/gpu/drm/i915/intel_lvds.c | 5 +----
> 1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lvds.c b/drivers/gpu/drm/i915/intel_lvds.c
> index 30a8403a8f4f..cd9fe609aefb 100644
> --- a/drivers/gpu/drm/i915/intel_lvds.c
> +++ b/drivers/gpu/drm/i915/intel_lvds.c
> @@ -478,11 +478,8 @@ static int intel_lid_notify(struct notifier_block *nb, unsigned long val,
> * and as part of the cleanup in the hw state restore we also redisable
> * the vga plane.
> */
> - if (!HAS_PCH_SPLIT(dev)) {
> - drm_modeset_lock_all(dev);
> + if (!HAS_PCH_SPLIT(dev))
> intel_display_resume(dev);
> - drm_modeset_unlock_all(dev);
> - }
>
> dev_priv->modeset_restore = MODESET_DONE;
>
> --
> 2.1.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch