[PATCH 0/2] Workaround for MST displays failing to come back after resume

From: Lyude
Date: Fri Mar 11 2016 - 10:58:17 EST

This is a follow-up to the original patches I sent to try to fix the issue of
drm_dp_mst_topology_mgr_resume() not working due to aux transactions failing
temporarily after resuming the machine. Unfortunately I haven't been able to
figure out the actual cause of this issue; there don't seem to be any IRQs
interrupting the DP aux transactions and everything seems to be normal except
for the amount of time this MST dock takes to become available again over the DP
aux channel. I have made a few discoveries though:

The reason why calling intel_dp_mst_resume() before calling
intel_runtime_pm_enable_interrupts() worked is due to how
intel_dp_aux_wait_done() works:

static uint32_t
intel_dp_aux_wait_done(struct intel_dp *intel_dp, bool has_aux_irq)
{
	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
	struct drm_device *dev = intel_dig_port->base.base.dev;
	struct drm_i915_private *dev_priv = dev->dev_private;
	i915_reg_t ch_ctl = intel_dp->aux_ch_ctl_reg;
	uint32_t status;
	bool done;

#define C (((status = I915_READ_NOTRACE(ch_ctl)) & DP_AUX_CH_CTL_SEND_BUSY) == 0)
	if (has_aux_irq)
		done = wait_event_timeout(dev_priv->gmbus_wait_queue, C,
					  msecs_to_jiffies_timeout(10));
	else
		done = wait_for_atomic(C, 10) == 0;
	if (!done)
		DRM_ERROR("dp aux hw did not signal timeout (has irq: %i)!\n",
			  has_aux_irq);
#undef C

	return status;
}

When this function is called without interrupts enabled, wait_event_timeout()
ends up timing out after 10ms, manually checking the DP AUX status register, and
discovering that the aux transaction succeeded. This makes the aux transactions
take quite a while, but they still manage to work. Because of this, there's
always a 10ms delay each time we do a transaction, and we end up delaying things
long enough for the aux transactions to become functional again, which results
in intel_dp_mst_resume() working. With interrupts enabled, we get notified of
the timeouts within a period of 3ms five times in a row, which doesn't give
enough time for the aux transactions to start working again. If we change the
timeout to something shorter like 3ms, calling intel_dp_mst_resume() before
intel_runtime_pm_enable_interrupts() stops working.
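
As a rough illustration of the timing argument above, here is a toy model (not
driver code) of a dock that ignores aux transactions until some wake-up time has
passed. The function name aux_retries_succeed(), the microsecond units, and the
20ms wake-up time used in the usage note are assumptions made purely for
illustration; the real wake-up time of the dock is not known.

```c
#include <stdbool.h>

/*
 * Toy model: the dock does not answer AUX transactions until
 * dock_ready_us microseconds after resume (hypothetical value).
 * Every failed attempt costs attempt_cost_us before the caller
 * learns about the failure and tries again.
 */
static bool aux_retries_succeed(unsigned int dock_ready_us,
				unsigned int attempt_cost_us,
				unsigned int max_attempts)
{
	unsigned int now_us = 0;
	unsigned int i;

	for (i = 0; i < max_attempts; i++) {
		if (now_us >= dock_ready_us)
			return true;	/* the dock answers this attempt */
		now_us += attempt_cost_us;	/* wait out the failed attempt */
	}

	return false;	/* ran out of attempts before the dock woke up */
}
```

With an assumed 20ms wake-up time and five attempts, a 10ms cost per attempt
(the polled case) succeeds on the third try, while five attempts spread over
3ms (roughly 600us each, the interrupt-driven case) all fail, as does a 3ms
cost per attempt.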

So, the only explanation I can come up with for this issue is that the MST dock
simply needs more time to respond. It's possible this issue has actually always
been here, but we just never managed to have a skl machine resume quickly enough
to notice it. The T560, the only machine I seem to be able to reproduce this
issue on, does happen to be the fastest model out of all of the Skylake
production machines I have available here.

It should be noted that for the second patch, I've considered a different
workaround: calling intel_dp_check_mst_status() before calling
drm_dp_mst_topology_mgr_resume(). This is another viable solution since it
causes us to try to read the ESI from the dock using intel_dp_dpcd_read_wake(),
which retries aux transactions enough times to give the dock time to resume. If
everyone would rather that solution, I'd be happy to post that version of the
patch instead.
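
For comparison, the approach named in the second patch's title (retry after
30ms if the MST resume fails) can be sketched against a fake dock. This is only
a sketch under stated assumptions, not the patch itself: struct fake_dock,
fake_mst_resume(), fake_sleep_ms() and resume_with_retry() are hypothetical
names, and the fake dock simply starts answering once its wake-up time has
elapsed.

```c
/* Hypothetical stand-in for the dock: it answers only once awake. */
struct fake_dock {
	unsigned int now_ms;	/* simulated time since resume */
	unsigned int ready_ms;	/* assumed time at which the dock answers */
};

/* Stand-in for the MST resume call: 0 on success, -1 on failure. */
static int fake_mst_resume(const struct fake_dock *dock)
{
	return dock->now_ms >= dock->ready_ms ? 0 : -1;
}

/* Stand-in for a sleep: just advances the simulated clock. */
static void fake_sleep_ms(struct fake_dock *dock, unsigned int ms)
{
	dock->now_ms += ms;
}

/* Try once; on failure wait retry_delay_ms and try exactly once more. */
static int resume_with_retry(struct fake_dock *dock,
			     unsigned int retry_delay_ms)
{
	int ret = fake_mst_resume(dock);

	if (ret == 0)
		return 0;

	fake_sleep_ms(dock, retry_delay_ms);
	return fake_mst_resume(dock);
}
```

In this model a single 30ms retry rescues any dock that wakes up within 30ms of
the first attempt, and costs nothing when the first attempt already succeeds.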

Lyude (2):
  drm/i915: Call intel_dp_mst_resume() before resuming displays
  drm/i915: Retry after 30ms if we fail to resume DP MST

 drivers/gpu/drm/i915/i915_drv.c |  4 ++--
 drivers/gpu/drm/i915/intel_dp.c | 13 +++++++++++++
 2 files changed, 15 insertions(+), 2 deletions(-)