[PATCH v12 2/7] drm/i915/skl: Add support for the SAGV, fix underrun hangs

From: Lyude
Date: Wed Aug 17 2016 - 15:56:16 EST

Next message: Bjorn Helgaas: "Re: [PATCH v5 0/3] pcie-designware: add iATU unroll feature"
Previous message: Lyude: "[PATCH v12 0/7] Finally fix watermarks"
In reply to: Lyude: "[PATCH v12 0/7] Finally fix watermarks"
Next in thread: Maarten Lankhorst: "Re: [PATCH v12 2/7] drm/i915/skl: Add support for the SAGV, fix underrun hangs"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Since the watermark calculations for Skylake are still broken, we're apt
to hitting underruns very easily under multi-monitor configurations.
While it would be lovely if this was fixed, it's not. Another problem
that's been coming from this however, is the mysterious issue of
underruns causing full system hangs. An easy way to reproduce this with
a skylake system:

- Get a laptop with a skylake GPU, and hook up two external monitors to
it
- Move the cursor from the built-in LCD to one of the external displays
as quickly as you can
- You'll get a few pipe underruns, and eventually the entire system will
just freeze.

After doing a lot of investigation and reading through the bspec, I
found the existence of the SAGV, which is responsible for adjusting the
system agent voltage and clock frequencies depending on how much power
we need. According to the bspec:

"The display engine access to system memory is blocked during the
adjustment time. SAGV defaults to enabled. Software must use the
GT-driver pcode mailbox to disable SAGV when the display engine is not
able to tolerate the blocking time."

The rest of the bspec goes on to explain that software can simply leave
the SAGV enabled, and disable it when we use interlaced pipes/have more
then one pipe active.

Sure enough, with this patchset the system hangs resulting from pipe
underruns on Skylake have completely vanished on my T460s. Additionally,
the bspec mentions turning off the SAGV with more then one pipe enabled
as a workaround for display underruns. While this patch doesn't entirely
fix that, it looks like it does improve the situation a little bit so
it's likely this is going to be required to make watermarks on Skylake
fully functional.

This will still need additional work in the future: we shouldn't be
enabling the SAGV if any of the currently enabled planes can't enable WM
levels that introduce latencies >= 30 Âs.

Changes since v11:
- Add skl_can_enable_sagv()
- Make sure we don't enable SAGV when not all planes can enable
watermarks >= the SAGV engine block time. I was originally going to
save this for later, but I recently managed to run into a machine
that was having problems with a single pipe configuration + SAGV.
- Make comparisons to I915_SKL_SAGV_NOT_CONTROLLED explicit
- Change I915_SAGV_DYNAMIC_FREQ to I915_SAGV_ENABLE
- Move printks outside of mutexes
- Don't print error messages twice
Changes since v10:
- Apparently sandybridge_pcode_read actually writes values and reads
them back, despite it's misleading function name. This means we've
been doing this mostly wrong and have been writing garbage to the
SAGV control. Because of this, we no longer attempt to read the SAGV
status during initialization (since there are no helpers for this).
- mlankhorst noticed that this patch was breaking on some very early
pre-release Skylake machines, which apparently don't allow you to
disable the SAGV. To prevent machines from failing tests due to SAGV
errors, if the first time we try to control the SAGV results in the
mailbox indicating an invalid command, we just disable future attempts
to control the SAGV state by setting dev_priv->skl_sagv_status to
I915_SKL_SAGV_NOT_CONTROLLED and make a note of it in dmesg.
- Move mutex_unlock() a little higher in skl_enable_sagv(). This
doesn't actually fix anything, but lets us release the lock a little
sooner since we're finished with it.
Changes since v9:
- Only enable/disable sagv on Skylake
Changes since v8:
- Add intel_state->modeset guard to the conditional for
skl_enable_sagv()
Changes since v7:
- Remove GEN9_SAGV_LOW_FREQ, replace with GEN9_SAGV_IS_ENABLED (that's
all we use it for anyway)
- Use GEN9_SAGV_IS_ENABLED instead of 0x1 for clarification
- Fix a styling error that snuck past me
Changes since v6:
- Protect skl_enable_sagv() with intel_state->modeset conditional in
intel_atomic_commit_tail()
Changes since v5:
- Don't use is_power_of_2. Makes things confusing
- Don't use the old state to figure out whether or not to
enable/disable the sagv, use the new one
- Split the loop in skl_disable_sagv into it's own function
- Move skl_sagv_enable/disable() calls into intel_atomic_commit_tail()
Changes since v4:
- Use is_power_of_2 against active_crtcs to check whether we have > 1
pipe enabled
- Fix skl_sagv_get_hw_state(): (temp & 0x1) indicates disabled, 0x0
enabled
- Call skl_sagv_enable/disable() from pre/post-plane updates
Changes since v3:
- Use time_before() to compare timeout to jiffies
Changes since v2:
- Really apply minor style nitpicks to patch this time
Changes since v1:
- Added comments about this probably being one of the requirements to
fixing Skylake's watermark issues
- Minor style nitpicks from Matt Roper
- Disable these functions on Broxton, since it doesn't have an SAGV

Signed-off-by: Lyude <cpaul@xxxxxxxxxx>
Cc: Matt Roper <matthew.d.roper@xxxxxxxxx>
Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
Cc: Ville SyrjÃlÃ <ville.syrjala@xxxxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
---
drivers/gpu/drm/i915/i915_drv.h | 7 ++
drivers/gpu/drm/i915/i915_reg.h | 4 +
drivers/gpu/drm/i915/intel_display.c | 11 +++
drivers/gpu/drm/i915/intel_drv.h | 3 +
drivers/gpu/drm/i915/intel_pm.c | 148 +++++++++++++++++++++++++++++++++++
5 files changed, 173 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 35caa9b..f20530bb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1949,6 +1949,13 @@ struct drm_i915_private {
struct i915_suspend_saved_registers regfile;
struct vlv_s0ix_state vlv_s0ix_state;

+ enum {
+ I915_SKL_SAGV_UNKNOWN = 0,
+ I915_SKL_SAGV_DISABLED,
+ I915_SKL_SAGV_ENABLED,
+ I915_SKL_SAGV_NOT_CONTROLLED
+ } skl_sagv_status;
+
struct {
/*
* Raw watermark latency values:
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 7419fbf..be82c49 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7153,6 +7153,10 @@ enum {
#define HSW_PCODE_DE_WRITE_FREQ_REQ 0x17
#define DISPLAY_IPS_CONTROL 0x19
#define HSW_PCODE_DYNAMIC_DUTY_CYCLE_CONTROL 0x1A
+#define GEN9_PCODE_SAGV_CONTROL 0x21
+#define GEN9_SAGV_DISABLE 0x0
+#define GEN9_SAGV_IS_DISABLED 0x1
+#define GEN9_SAGV_ENABLE 0x3
#define GEN6_PCODE_DATA _MMIO(0x138128)
#define GEN6_PCODE_FREQ_IA_RATIO_SHIFT 8
#define GEN6_PCODE_FREQ_RING_RATIO_SHIFT 16
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 781d22e..ca4b83f 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -14156,6 +14156,13 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
intel_state->cdclk_pll_vco != dev_priv->cdclk_pll.vco))
dev_priv->display.modeset_commit_cdclk(state);

+ /*
+ * SKL workaround: bspec recommends we disable the SAGV when we
+ * have more then one pipe enabled
+ */
+ if (IS_SKYLAKE(dev_priv) && !skl_can_enable_sagv(state))
+ skl_disable_sagv(dev_priv);
+
intel_modeset_verify_disabled(dev);
}

@@ -14229,6 +14236,10 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
intel_modeset_verify_crtc(crtc, old_crtc_state, crtc->state);
}

+ if (IS_SKYLAKE(dev_priv) && intel_state->modeset &&
+ skl_can_enable_sagv(state))
+ skl_enable_sagv(dev_priv);
+
drm_atomic_helper_commit_hw_done(state);

if (intel_state->modeset)
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 1c700b0..d203c77 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1722,6 +1722,9 @@ void ilk_wm_get_hw_state(struct drm_device *dev);
void skl_wm_get_hw_state(struct drm_device *dev);
void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
struct skl_ddb_allocation *ddb /* out */);
+bool skl_can_enable_sagv(struct drm_atomic_state *state);
+int skl_enable_sagv(struct drm_i915_private *dev_priv);
+int skl_disable_sagv(struct drm_i915_private *dev_priv);
uint32_t ilk_pipe_pixel_rate(const struct intel_crtc_state *pipe_config);
bool ilk_disable_lp_wm(struct drm_device *dev);
int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index b4cf988..fed2bae8 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2860,6 +2860,7 @@ bool ilk_disable_lp_wm(struct drm_device *dev)

#define SKL_DDB_SIZE 896 /* in blocks */
#define BXT_DDB_SIZE 512
+#define SKL_SAGV_BLOCK_TIME 30 /* Âs */

/*
* Return the index of a plane in the SKL DDB and wm result arrays. Primary
@@ -2883,6 +2884,153 @@ skl_wm_plane_id(const struct intel_plane *plane)
}
}

+/*
+ * SAGV dynamically adjusts the system agent voltage and clock frequencies
+ * depending on power and performance requirements. The display engine access
+ * to system memory is blocked during the adjustment time. Because of the
+ * blocking time, having this enabled can cause full system hangs and/or pipe
+ * underruns if we don't meet all of the following requirements:
+ *
+ * - <= 1 pipe enabled
+ * - All planes can enable watermarks for latencies >= SAGV engine block time
+ * - We're not using an interlaced display configuration
+ */
+int
+skl_enable_sagv(struct drm_i915_private *dev_priv)
+{
+ int ret;
+
+ if (dev_priv->skl_sagv_status == I915_SKL_SAGV_NOT_CONTROLLED ||
+ dev_priv->skl_sagv_status == I915_SKL_SAGV_ENABLED)
+ return 0;
+
+ DRM_DEBUG_KMS("Enabling the SAGV\n");
+ mutex_lock(&dev_priv->rps.hw_lock);
+
+ ret = sandybridge_pcode_write(dev_priv, GEN9_PCODE_SAGV_CONTROL,
+ GEN9_SAGV_ENABLE);
+
+ /* We don't need to wait for the SAGV when enabling */
+ mutex_unlock(&dev_priv->rps.hw_lock);
+
+ /*
+ * Some skl systems, pre-release machines in particular,
+ * don't actually have an SAGV.
+ */
+ if (ret == -ENOSYS) {
+ DRM_DEBUG_DRIVER("No SAGV found on system, ignoring\n");
+ dev_priv->skl_sagv_status = I915_SKL_SAGV_NOT_CONTROLLED;
+ return 0;
+ } else if (ret < 0) {
+ DRM_ERROR("Failed to enable the SAGV\n");
+ return ret;
+ }
+
+ dev_priv->skl_sagv_status = I915_SKL_SAGV_ENABLED;
+ return 0;
+}
+
+static int
+skl_do_sagv_disable(struct drm_i915_private *dev_priv)
+{
+ int ret;
+ uint32_t temp = GEN9_SAGV_DISABLE;
+
+ ret = sandybridge_pcode_read(dev_priv, GEN9_PCODE_SAGV_CONTROL,
+ &temp);
+ if (ret)
+ return ret;
+ else
+ return temp & GEN9_SAGV_IS_DISABLED;
+}
+
+int
+skl_disable_sagv(struct drm_i915_private *dev_priv)
+{
+ int ret, result;
+
+ if (dev_priv->skl_sagv_status == I915_SKL_SAGV_NOT_CONTROLLED ||
+ dev_priv->skl_sagv_status == I915_SKL_SAGV_DISABLED)
+ return 0;
+
+ DRM_DEBUG_KMS("Disabling the SAGV\n");
+ mutex_lock(&dev_priv->rps.hw_lock);
+
+ /* bspec says to keep retrying for at least 1 ms */
+ ret = wait_for(result = skl_do_sagv_disable(dev_priv), 1);
+ mutex_unlock(&dev_priv->rps.hw_lock);
+
+ if (ret == -ETIMEDOUT) {
+ DRM_ERROR("Request to disable SAGV timed out\n");
+ return -ETIMEDOUT;
+ }
+
+ /*
+ * Some skl systems, pre-release machines in particular,
+ * don't actually have an SAGV.
+ */
+ if (result == -ENOSYS) {
+ DRM_DEBUG_DRIVER("No SAGV found on system, ignoring\n");
+ dev_priv->skl_sagv_status = I915_SKL_SAGV_NOT_CONTROLLED;
+ return 0;
+ } else if (result < 0) {
+ DRM_ERROR("Failed to disable the SAGV\n");
+ return result;
+ }
+
+ dev_priv->skl_sagv_status = I915_SKL_SAGV_DISABLED;
+ return 0;
+}
+
+bool skl_can_enable_sagv(struct drm_atomic_state *state)
+{
+ struct drm_device *dev = state->dev;
+ struct drm_i915_private *dev_priv = to_i915(dev);
+ struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+ struct drm_crtc *crtc;
+ enum pipe pipe;
+ int level, plane;
+
+ /*
+ * SKL workaround: bspec recommends we disable the SAGV when we have
+ * more then one pipe enabled
+ *
+ * If there are no active CRTCs, no additional checks need be performed
+ */
+ if (hweight32(intel_state->active_crtcs) == 0)
+ return true;
+ else if (hweight32(intel_state->active_crtcs) > 1)
+ return false;
+
+ /* Since we're now guaranteed to only have one active CRTC... */
+ pipe = ffs(intel_state->active_crtcs) - 1;
+ crtc = dev_priv->pipe_to_crtc_mapping[pipe];
+
+ if (crtc->state->mode.flags & DRM_MODE_FLAG_INTERLACE)
+ return false;
+
+ for_each_plane(dev_priv, pipe, plane) {
+ /* Skip this plane if it's not enabled */
+ if (intel_state->wm_results.plane[pipe][plane][0] == 0)
+ continue;
+
+ /* Find the highest enabled wm level for this plane */
+ for (level = ilk_wm_max_level(dev);
+ intel_state->wm_results.plane[pipe][plane][level] == 0;
+ --level);
+
+ /*
+ * If any of the planes on this pipe don't enable wm levels
+ * that incur memory latencies higher then 30Âs we can't enable
+ * the SAGV
+ */
+ if (dev_priv->wm.skl_latency[level] < SKL_SAGV_BLOCK_TIME)
+ return false;
+ }
+
+ return true;
+}
+
static void
skl_ddb_get_pipe_allocation_limits(struct drm_device *dev,
const struct intel_crtc_state *cstate,
--
2.7.4

Next message: Bjorn Helgaas: "Re: [PATCH v5 0/3] pcie-designware: add iATU unroll feature"
Previous message: Lyude: "[PATCH v12 0/7] Finally fix watermarks"
In reply to: Lyude: "[PATCH v12 0/7] Finally fix watermarks"
Next in thread: Maarten Lankhorst: "Re: [PATCH v12 2/7] drm/i915/skl: Add support for the SAGV, fix underrun hangs"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]