Re: [PATCH v5 2/2] drm/i915: Acquire PUNIT->PMIC bus for intel_uncore_forcewake_reset()

From: Ingo Molnar
Date: Tue Oct 31 2017 - 05:50:19 EST



* Hans de Goede <j.w.r.degoede@xxxxxxxxx> wrote:

> intel_uncore_forcewake_reset() does forcewake puts and gets as such
> we need to make sure that no-one tries to access the PUNIT->PMIC bus
> (on systems where this bus is shared) while it runs, otherwise bad
> things happen.
>
> Normally this is taken care of by the i915_pmic_bus_access_notifier()
> which does an intel_uncore_forcewake_get(FORCEWAKE_ALL) when some other
> driver tries to access the PMIC bus, so that later forcewake gets are
> no-ops (for the duration of the bus access).
>
> But intel_uncore_forcewake_reset gets called in 3 cases:
> 1) Before registering the pmic_bus_access_notifier
> 2) After unregistering the pmic_bus_access_notifier
> 3) To reset forcewake state on a GPU reset
>
> In all 3 cases the i915_pmic_bus_access_notifier() protection is
> insufficient.
>
> This commit fixes this race by calling iosf_mbi_punit_acquire() before
> calling intel_uncore_forcewake_reset(). In the case where it is called
> directly after unregistering the pmic_bus_access_notifier, we need to
> hold the punit-lock over both calls to avoid a race where
> intel_uncore_fw_release_timer() may execute between the 2 calls.
>
> Signed-off-by: Hans de Goede <hdegoede@xxxxxxxxxx>
> Reviewed-by: Imre Deak <imre.deak@xxxxxxxxx>
> ---
> Changes in v2:
> -Rebase on current (July 6th 2017) drm-next
>
> Changes in v3:
> -Keep punit acquired / locked over the unregister + forcewake_reset
> call combo to avoid a race hitting between the 2 calls
> -This requires modifying iosf_mbi_unregister_pmic_bus_access_notifier
> to not take the lock itself, since we are the only users this is done
> in this same commit
>
> Changes in v4:
> -Fix missing rename in doc-comment
> -Add and use iosf_mbi_assert_punit_acquired() helper
> -Add missing acquire surrounding intel_uncore_forcewake_reset() inside
> intel_uncore_check_forcewake_domains()
> -Add Imre's Reviewed-by
>
> Changes in v5:
> -Separate out arch/x86 iosf_mbi changes into a separate patch
> ---
> drivers/gpu/drm/i915/intel_uncore.c | 17 +++++++++++++----
> drivers/gpu/drm/i915/selftests/intel_uncore.c | 3 +++
> 2 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 8c2ce81f01c2..0da81faf3981 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -229,6 +229,7 @@ intel_uncore_fw_release_timer(struct hrtimer *timer)
> return HRTIMER_NORESTART;
> }
>
> +/* Note callers must have acquired the PUNIT->PMIC bus, before calling this. */
> static void intel_uncore_forcewake_reset(struct drm_i915_private *dev_priv,
> bool restore)
> {
> @@ -237,6 +238,8 @@ static void intel_uncore_forcewake_reset(struct drm_i915_private *dev_priv,
> int retry_count = 100;
> enum forcewake_domains fw, active_domains;
>
> + iosf_mbi_assert_punit_acquired();
> +
> /* Hold uncore.lock across reset to prevent any register access
> * with forcewake not set correctly. Wait until all pending
> * timers are run before holding.
> @@ -416,14 +419,18 @@ static void __intel_uncore_early_sanitize(struct drm_i915_private *dev_priv,
> GT_FIFO_CTL_RC6_POLICY_STALL);
> }
>
> + iosf_mbi_punit_acquire();
> intel_uncore_forcewake_reset(dev_priv, restore_forcewake);
> + iosf_mbi_punit_release();
> }
>
> void intel_uncore_suspend(struct drm_i915_private *dev_priv)
> {
> - iosf_mbi_unregister_pmic_bus_access_notifier(
> + iosf_mbi_punit_acquire();
> + iosf_mbi_unregister_pmic_bus_access_notifier_unlocked(
> &dev_priv->uncore.pmic_bus_access_nb);
> intel_uncore_forcewake_reset(dev_priv, false);
> + iosf_mbi_punit_release();
> }
>
> void intel_uncore_resume_early(struct drm_i915_private *dev_priv)
> @@ -1315,12 +1322,14 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
>
> void intel_uncore_fini(struct drm_i915_private *dev_priv)
> {
> - iosf_mbi_unregister_pmic_bus_access_notifier(
> - &dev_priv->uncore.pmic_bus_access_nb);
> -
> /* Paranoia: make sure we have disabled everything before we exit. */
> intel_uncore_sanitize(dev_priv);
> +
> + iosf_mbi_punit_acquire();
> + iosf_mbi_unregister_pmic_bus_access_notifier_unlocked(
> + &dev_priv->uncore.pmic_bus_access_nb);
> intel_uncore_forcewake_reset(dev_priv, false);
> + iosf_mbi_punit_release();
> }
>
> static const struct reg_whitelist {
> diff --git a/drivers/gpu/drm/i915/selftests/intel_uncore.c b/drivers/gpu/drm/i915/selftests/intel_uncore.c
> index 3cac22eb47ce..733d87fe7737 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_uncore.c
> @@ -148,7 +148,10 @@ static int intel_uncore_check_forcewake_domains(struct drm_i915_private *dev_pri
> for_each_set_bit(offset, valid, FW_RANGE) {
> i915_reg_t reg = { offset };
>
> + iosf_mbi_punit_acquire();
> intel_uncore_forcewake_reset(dev_priv, false);
> + iosf_mbi_punit_release();
> +
> check_for_unclaimed_mmio(dev_priv);
>
> (void)I915_READ(reg);

This patch looks like one massive layering violation. Why does the GPU code muck
with the uncore hardware?

Thanks,

Ingo