Re: [PATCH] pmdomain: qcom: rpmhpd: Fix enabled_corner aggregation

From: Johan Hovold
Date: Tue Feb 27 2024 - 04:15:21 EST


On Mon, Feb 26, 2024 at 05:49:57PM -0800, Bjorn Andersson via B4 Relay wrote:
> From: Bjorn Andersson <quic_bjorande@xxxxxxxxxxx>
>
> Commit 'e3e56c050ab6 ("soc: qcom: rpmhpd: Make power_on actually enable
> the domain")' aimed to make sure that a power-domain that is being
> enabled without any particular performance-state requested will at least
> turn the rail on, to avoid filling DeviceTree with otherwise unnecessary
> required-opps properties.
>
> But in the event that aggregation happens on a disabled power-domain, with
> an enabled peer without performance-state, both the local and peer
> corner are 0. The peer's enabled_corner is not considered, with the
> result that the underlying (shared) resource is disabled.
>
> One case where this can be observed is when the display stack keeps mmcx
> enabled (but without a particular performance-state vote) in order to
> access registers and sync_state happens in the rpmhpd driver. As mmcx_ao
> is flushed the state of the peer (mmcx) is not considered and mmcx_ao
> ends up turning off "mmcx.lvl" underneath mmcx. This has been observed
> several times, but has been painted over in DeviceTree by adding an
> explicit vote for the lowest non-disabled performance-state.
>
> Fixes: e3e56c050ab6 ("soc: qcom: rpmhpd: Make power_on actually enable the domain")
> Reported-by: Johan Hovold <johan@xxxxxxxxxx>
> Closes: https://lore.kernel.org/linux-arm-msm/ZdMwZa98L23mu3u6@xxxxxxxxxxxxxxxxxxxx/
> Cc: <stable@xxxxxxxxxxxxxxx>
> Signed-off-by: Bjorn Andersson <quic_bjorande@xxxxxxxxxxx>
> ---
> This issue is the root cause of a display regression on SC8280XP boards,
> resulting in the system often resetting during boot. It was exposed by
> the refactoring of the DisplayPort driver in v6.8-rc1.
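
For anyone following along, here is how I read the aggregation problem described above, as a rough userspace model rather than the driver code. The struct model_pd, its fields and the two aggregate_*() helpers are made up for illustration; the real driver also splits votes into active and sleep sets, which is left out here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-domain state. */
struct model_pd {
	struct model_pd *peer;      /* e.g. mmcx <-> mmcx_ao share one resource */
	unsigned int corner;        /* requested performance-state vote, 0 = none */
	unsigned int enable_corner; /* lowest non-disabled corner */
	bool enabled;
};

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Behaviour as described in the commit message: only the peer's
 * performance-state vote is considered, so an enabled peer without a
 * vote contributes 0.
 */
static unsigned int aggregate_before(const struct model_pd *pd,
				     unsigned int corner)
{
	unsigned int peer_corner = 0;

	if (pd->peer && pd->peer->enabled)
		peer_corner = pd->peer->corner;

	return max_u(corner, peer_corner);
}

/*
 * Assumed shape of the fix: an enabled peer contributes at least its
 * enable_corner, so the shared resource stays on.
 */
static unsigned int aggregate_after(const struct model_pd *pd,
				    unsigned int corner)
{
	unsigned int peer_corner = 0;

	if (pd->peer && pd->peer->enabled)
		peer_corner = max_u(pd->peer->corner, pd->peer->enable_corner);

	return max_u(corner, peer_corner);
}
```

With mmcx enabled but without a vote, and mmcx_ao disabled and flushing a vote of 0, the first helper aggregates to 0 (resource off) while the second keeps the resource at the peer's enable_corner.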

This fixes the hard resets I've been seeing since rc1 when initialising
the display subsystem of the Lenovo ThinkPad X13s at boot. With some
instrumentation added I can see the resets coinciding with the call to
rpmhpd_aggregate_corner() for 'mmcx_ao':

Tested-by: Johan Hovold <johan+linaro@xxxxxxxxxx>