Re: [PATCH] Revert "ARM: dts: exynos: Remove 'opp-shared' from Exynos4412 bus OPP-tables"

From: Krzysztof Kozlowski
Date: Tue Feb 23 2021 - 15:02:03 EST


On Tue, Feb 23, 2021 at 10:24:41AM +0100, Marek Szyprowski wrote:
> Hi Markus,
>
> On 22.02.2021 10:54, Markus Reichl wrote:
> > This reverts commit a23beead41a18c3be3ca409cb52f35bc02e601b9.
> >
> > I'm running an Odroid-X2 as headless 24/7 server.
> > With plain stable 5.10.1 I had 54 up days without problems.
> > With opp-shared removed on kernels before and now on 5.11
> > my system freezes after some days on disk activity to eMMC
> > (rsync, apt upgrade).
> >
> > The spontaneous hangs are not easy to reproduce but testing this
> > for several months now I am quite confident that there is something
> > wrong with this patch.
> >
> > Signed-off-by: Markus Reichl <m.reichl@xxxxxxxxxxxxx>
>
> Thanks for the report.
>
> IMHO a straight revert is a bad idea. I would prefer to keep current opp
> definitions and disable the affected devfreq devices (probably right bus
> would be enough) or try to identify which transitions are responsible
> for that issue. I know that it would take some time to identify them,
> but that would be the best solution. Reverting leads to an incorrect
> hardware description, which in turn confuses the driver and framework,
> which in turn hides a real problem.

I agree with this approach. If devfreq is unusable on that platform,
let's try disabling the exynos-bus nodes. It could be enough to help.
The opp-shared property does not look like a proper fix for this problem,
but rather an incorrect solution which achieves the same result - disabling
frequency/voltage scaling on some buses.

>
> Another problem related to devfreq on Exynos4412 has been introduced
> recently by the commit 86ad9a24f21e ("PM / devfreq: Add required OPPs
> support to passive governor"). You can see lots of the messages like
> this one:
>
> devfreq soc:bus-acp: failed to update devfreq using passive governor
>
> I didn't have time to check what's wrong there, but I consider devfreq
> on Exynos a little bit broken, so another solution would be just to
> disable it in the exynos_defconfig.

Yes, I saw it as well. However, a defconfig is only a defconfig, so
customers building with their own config would still be affected and
might still report bugs for it. Maybe it would be better to disable all
exynos-bus nodes?
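
For illustration, disabling the nodes could look roughly like the
fragment below. This is an untested sketch, not a proposed patch: the
node labels (bus_dmc, bus_acp, ...) are assumed to match the ones in
exynos4412.dtsi, and the board file that currently enables them is
assumed to be the Odroid common dtsi.

```dts
/*
 * Untested sketch: override the exynos-bus nodes in the board DTS so
 * that the exynos-bus devfreq driver never probes them.
 * Labels are assumptions taken from exynos4412.dtsi.
 */
&bus_dmc {
	status = "disabled";
};

&bus_acp {
	status = "disabled";
};

&bus_c2c {
	status = "disabled";
};

&bus_leftbus {
	status = "disabled";
};

&bus_rightbus {
	status = "disabled";
};

&bus_display {
	status = "disabled";
};

&bus_fsys {
	status = "disabled";
};

&bus_peri {
	status = "disabled";
};

&bus_mfc {
	status = "disabled";
};
```

If the nodes are already disabled by default in the SoC dtsi, dropping
the status = "okay" overrides from the board file should have the same
effect. Either way the OPP tables stay as they are, so the hardware
description remains correct.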

Best regards,
Krzysztof