RE: [PATCH] soc: qcom: llcc: Fix dis_cap_alloc and retain_on_pc configuration

From: Stephen Boyd
Date: Thu Nov 09 2023 - 15:34:09 EST


Quoting Atul Dhudase (QUIC) (2023-11-07 07:27:54)
> Hi,
>
> On 11/7/2023 6:46 PM, Mukesh Ojha wrote:
> > On 11/7/2023 3:25 AM, Stephen Boyd wrote:
> > > Quoting Mukesh Ojha (2023-11-05 22:54:28)
> > >>
> > >>
> > >> On 11/4/2023 1:03 AM, Bjorn Andersson wrote:
> > >>> On Fri, Nov 03, 2023 at 04:27:12PM +0530, Atul Dhudase wrote:
> > >>>> While programming dis_cap_alloc and retain_on_pc, set a bit
> > >>>> corresponding to a specific SCID without disturbing the
> > >>>> previously configured bits.
> > >>>>
> > >>>
> > >>> As far as I can see, the only invocation of
> > >>> _qcom_llcc_cfg_program() comes from qcom_llcc_cfg_program(), which
> > >>> is only called once, from qcom_llcc_probe(), and here also seems
> > >>> to only be the single write to these two registers.
> > >>
> > >> It does not look to be single write but the write is for each slice
> > >> in the same register which was overriding other slices values.
> > >
> > > Can you add that detail to the commit text? What's the seriousness
> > > of the issue? Why should it be backported to stable? Is something
> > > seriously broken because a slice configuration is overwritten? Does
> > > it mean that some allocation made in a slice is being lost over
> > > power collapse (pc) when it shouldn't be?
> >
> > @Atul will update the commit text as per suggestion.
> >
> > And yes, without this change, retention feature will not work properly.
> >
> > -Mukesh
>
> Does this commit text make sense? If so, I will update patch accordingly.
>
> When configuring capacity based allocation and power collapse retention, writing to the same register for each slice caused overwriting of the values for other slices, leading to misconfiguration for majority of the slices.
> To address this, only modify the bits associated with each slice while retaining the values of the remaining bits, as they were prior to the Linux configuration.

This commit text doesn't say what, if anything, is broken. Does it save
power when it didn't before? Why is it critical to backport this to
stable kernels? Was the driver overwriting other bits in this register
that were critical to power, performance, or correctness? Or was
everything working out because the last slice to be written was the only
important one?