Re: [PATCH 2/3] arm64: dts: qcom: sa8775p: Add CPU OPP tables to scale DDR/L3
From: Jagadeesh Kona
Date: Wed Dec 04 2024 - 03:45:52 EST
On 12/4/2024 8:43 AM, Dmitry Baryshkov wrote:
> On Tue, Dec 03, 2024 at 08:33:46PM +0530, Jagadeesh Kona wrote:
>>
>>
>> On 11/30/2024 8:02 PM, Konrad Dybcio wrote:
>>> On 14.11.2024 11:48 PM, Dmitry Baryshkov wrote:
>>>> On Mon, Nov 11, 2024 at 06:39:48PM +0530, Jagadeesh Kona wrote:
>>>>>
>>>>>
>>>>> On 10/17/2024 9:12 PM, Brian Masney wrote:
>>>>>> On Thu, Oct 17, 2024 at 02:58:31PM +0530, Jagadeesh Kona wrote:
>>>>>>> + cpu0_opp_table: opp-table-cpu0 {
>>>>>>> + compatible = "operating-points-v2";
>>>>>>> + opp-shared;
>>>>>>> +
>>>>>>> + cpu0_opp_1267mhz: opp-1267200000 {
>>>>>>> + opp-hz = /bits/ 64 <1267200000>;
>>>>>>> + opp-peak-kBps = <6220800 29491200>;
>>>>>>> + };
>>>>>>> +
>>>>>>> + cpu0_opp_1363mhz: opp-1363200000 {
>>>>>>> + opp-hz = /bits/ 64 <1363200000>;
>>>>>>> + opp-peak-kBps = <6220800 29491200>;
>>>>>>> + };
>>>>>>
>>>>>> [snip]
>>>>>>
>>>>>>> + cpu4_opp_table: opp-table-cpu4 {
>>>>>>> + compatible = "operating-points-v2";
>>>>>>> + opp-shared;
>>>>>>> +
>>>>>>> + cpu4_opp_1267mhz: opp-1267200000 {
>>>>>>> + opp-hz = /bits/ 64 <1267200000>;
>>>>>>> + opp-peak-kBps = <6220800 29491200>;
>>>>>>> + };
>>>>>>> +
>>>>>>> + cpu4_opp_1363mhz: opp-1363200000 {
>>>>>>> + opp-hz = /bits/ 64 <1363200000>;
>>>>>>> + opp-peak-kBps = <6220800 29491200>;
>>>>>>> + };
>>>>>>
>>>>>> There's no functional differences in the cpu0 and cpu4 opp tables. Can
>>>>>> a single table be used?
>>>>>>
>>>>>> This aligns with my recollection that this particular SoC only has the
>>>>>> gold cores.
>>>>>>
>>>>>> Brian
>>>>>>
>>>>>
>>>>> Thanks Brian for your review. Sorry for the delayed response.
>>>>>
>>>>> We require separate OPP tables for CPU0 and CPU4 to allow independent
>>>>> scaling of DDR and L3 frequencies for each CPU domain, with the final
>>>>> DDR and L3 frequencies being an aggregate of both.
>>>>>
>>>>> If we use a single OPP table for both CPU domains, then _allocate_opp_table() [1]
>>>>> won't be invoked for CPU4. As a result both CPU devices will end up in sharing
>>>>> the same ICC path handle, which could lead to one CPU device overwriting the bandwidth
>>>>> votes of other.
>>>
>>> Oh that's a fun find.. clocks get the same treatment.. very bad,
>>> but may explain some schroedingerbugs.
>>>
>>> Taking a peek at some code paths, wouldn't dropping opp-shared
>>> solve our issues? dev_pm_opp_set_sharing_cpus() overrides it
>>>
>>> Konrad
>>
>> Thanks Konrad for your review.
>>
>> Yes, correct. I tried dropping opp-shared but it is again getting set due to
>> dev_pm_opp_set_sharing_cpus().
>
> It should be set, but then it should get the limited CPU mask rather
> than the full CPU set. Isn't that enough for your case?
>
Even if we call dev_pm_opp_set_sharing_cpus() with the limited CPU mask, it sets the
OPP_TABLE_ACCESS_SHARED flag on the OPP table. With this flag set, if the same OPP
table is also referenced in DT by another CPU domain (CPU4-7), then _managed_opp() [1],
which is reached from dev_pm_opp_of_add_table() for CPU4, returns the existing CPU0
OPP table.

As a result, _allocate_opp_table() [2] is not invoked for CPU4; instead, CPU4 is added as
a device under the CPU0 OPP table [3]. Consequently, dev_pm_opp_of_find_icc_paths() [4]
is not invoked for the CPU4 device, so CPU4 cannot scale its interconnects independently.
Both the CPU0 and CPU4 devices then scale the same ICC path, which can lead to one device
overwriting the BW vote placed by the other. So we need separate OPP tables for the two
domains.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/opp/core.c#n1600
[2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/opp/core.c#n1613
[3] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/opp/core.c#n1606
[4] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/opp/core.c#n1484
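
To illustrate the overwrite described above, here is a toy userspace model in C.
It is only a sketch and not kernel code: toy_path, toy_set_bw() and the other
names are hypothetical, and it assumes that peak-bandwidth requests placed
through different path handles are aggregated as a maximum, while a new vote
through the same handle simply replaces the previous one.

```c
/* Toy stand-in for an interconnect path handle. It remembers only the
 * last peak-bandwidth vote placed through it. Hypothetical type, not
 * the kernel's struct icc_path. */
struct toy_path {
	unsigned int peak_kbps;
};

/* A vote through a handle replaces the previous vote on that handle. */
void toy_set_bw(struct toy_path *path, unsigned int peak_kbps)
{
	path->peak_kbps = peak_kbps;
}

/* Assumed aggregation rule: the largest peak request across handles wins. */
unsigned int toy_aggregate_peak(const struct toy_path *paths, int count)
{
	unsigned int peak = 0;
	int i;

	for (i = 0; i < count; i++)
		if (paths[i].peak_kbps > peak)
			peak = paths[i].peak_kbps;

	return peak;
}

/* Single OPP table: both CPU domains vote through one shared handle,
 * so the later vote from CPU4 overwrites the earlier vote from CPU0. */
unsigned int shared_handle_result(unsigned int cpu0_vote,
				  unsigned int cpu4_vote)
{
	struct toy_path shared = { 0 };

	toy_set_bw(&shared, cpu0_vote);
	toy_set_bw(&shared, cpu4_vote);

	return toy_aggregate_peak(&shared, 1);
}

/* Separate OPP tables: each CPU domain has its own handle, so both
 * votes survive and the final value is an aggregate of the two. */
unsigned int separate_handles_result(unsigned int cpu0_vote,
				     unsigned int cpu4_vote)
{
	struct toy_path paths[2] = { { 0 }, { 0 } };

	toy_set_bw(&paths[0], cpu0_vote);
	toy_set_bw(&paths[1], cpu4_vote);

	return toy_aggregate_peak(paths, 2);
}
```

For example, with CPU0 voting 6220800 kBps and CPU4 then voting a lower,
made-up value of 1555200 kBps, shared_handle_result() returns 1555200 (the
CPU0 vote is lost), whereas separate_handles_result() returns 6220800.
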
Thanks,
Jagadeesh