[PATCH 3/4] drm/i915: Discard previous atomic state on resume if connectors change

From: Lyude
Date: Tue May 31 2016 - 12:51:23 EST

If an MST device is disconnected while the machine is suspended, the
number of connectors will change as well after we call
intel_dp_mst_resume(). This means that any previous atomic state we had
before suspending is no longer valid, since it'll still be pointing to
missing connectors. We need to check for this before committing the
state, otherwise we'll kernel panic on resume whenever if any MST
display was disconnected before we started resuming:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffffa01588ef>] drm_atomic_helper_check_modeset+0x29f/0xb40 [drm_kms_helper]
Call Trace:
[<ffffffffa02354f4>] intel_atomic_check+0x34/0x1180 [i915]
[<ffffffff810e6c3f>] ? mark_held_locks+0x6f/0xa0
[<ffffffff810e6d99>] ? trace_hardirqs_on_caller+0x129/0x1b0
[<ffffffffa00ff1d2>] drm_atomic_check_only+0x192/0x620 [drm]
[<ffffffff813ee001>] ? pci_pm_thaw+0x21/0x90
[<ffffffffa00ff677>] drm_atomic_commit+0x17/0x60 [drm]
[<ffffffffa023e0ad>] intel_display_resume+0xbd/0x160 [i915]
[<ffffffff813ee070>] ? pci_pm_thaw+0x90/0x90
[<ffffffffa01b60d8>] i915_drm_resume+0xd8/0x160 [i915]
[<ffffffffa01b6185>] i915_pm_resume+0x25/0x30 [i915]
[<ffffffff813ee0d4>] pci_pm_resume+0x64/0xa0
[<ffffffff814d9ea0>] dpm_run_callback+0x90/0x190
[<ffffffff814da455>] device_resume+0xd5/0x1f0
[<ffffffff814da58d>] async_resume+0x1d/0x50
[<ffffffff810b6718>] async_run_entry_fn+0x48/0x150
[<ffffffff810acc19>] process_one_work+0x1e9/0x5c0
[<ffffffff810acb96>] ? process_one_work+0x166/0x5c0
[<ffffffff810ad038>] worker_thread+0x48/0x4e0
[<ffffffff810acff0>] ? process_one_work+0x5c0/0x5c0
[<ffffffff810b3794>] kthread+0xe4/0x100
[<ffffffff81742672>] ret_from_fork+0x22/0x50
[<ffffffff810b36b0>] ? kthread_create_on_node+0x200/0x200

Changes since v1:
- Move drm_atomic_state_free() call down so we're holding the
appropriate locks when destroying the atomic state
Changes since v2:
- Check that state != NULL before we start accessing it's members

This fix is only required for 4.6 and below. David Airlie's patchseries
for 4.7 to add connector reference counting provides a more proper fix
for this.

Upstream fix: 0552f7651bc2 ("drm/i915/mst: use reference counted
connectors. (v3)")
Reviewed-by: Daniel Vetter <daniel.vetter@xxxxxxxx>
Signed-off-by: Lyude <cpaul@xxxxxxxxxx>
drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 0104a06..6fdb90e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -15958,6 +15958,18 @@ void intel_display_resume(struct drm_device *dev)
ret = drm_modeset_lock_all_ctx(dev, &ctx);

+ /*
+ * With MST, the number of connectors can change between suspend and
+ * resume, which means that the state we want to restore might now be
+ * impossible to use since it'll be pointing to non-existant
+ * connectors.
+ */
+ if (ret == 0 && state &&
+ state->num_connector != dev->mode_config.num_connector) {
+ drm_atomic_state_free(state);
+ state = NULL;
+ }
if (ret == 0 && !setup) {
setup = true;