Re: [RFC PATCH] ARM: dts: exynos: partial revert of Adjust bus related OPPs to the values correct for Exynos5422 Odroids

From: Marek Szyprowski
Date: Mon Jul 13 2020 - 09:24:00 EST


Hi Willy,

On 03.07.2020 15:20, Willy Wolff wrote:
> On Odroid XU3/4 board, since 5.6 with 1019fe2c728003f89ee11482cf8ec81dbd8f15ba,
> the network is not working properly.
>
> After properly booting, when trying to connect to the board via ssh, the board
> hang for a while and this message happen:
>
> [ 211.111967] ------------[ cut here ]------------
> [ 211.117520] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:443 dev_watchdog+0x3ac/0x3e0
> [ 211.125636] NETDEV WATCHDOG: eth0 (smsc95xx): transmit queue 0 timed out
> [ 211.132058] Modules linked in:
> [ 211.134815] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.8.0-rc3-00082-gcdd3bb54332f-dirty #1
> [ 211.143518] Hardware name: Samsung Exynos (Flattened Device Tree)
> [ 211.149458] [<c0112290>] (unwind_backtrace) from [<c010d1ac>] (show_stack+0x10/0x14)
> [ 211.157287] [<c010d1ac>] (show_stack) from [<c051b93c>] (dump_stack+0xac/0xd8)
> [ 211.164416] [<c051b93c>] (dump_stack) from [<c0127a50>] (__warn+0xd0/0x108)
> [ 211.171301] [<c0127a50>] (__warn) from [<c0127e60>] (warn_slowpath_fmt+0x94/0xb8)
> [ 211.178824] [<c0127e60>] (warn_slowpath_fmt) from [<c0929b38>] (dev_watchdog+0x3ac/0x3e0)
> [ 211.187043] [<c0929b38>] (dev_watchdog) from [<c01c791c>] (call_timer_fn+0xd4/0x420)
> [ 211.194698] [<c01c791c>] (call_timer_fn) from [<c01c86ec>] (run_timer_softirq+0x620/0x784)
> [ 211.202980] [<c01c86ec>] (run_timer_softirq) from [<c0101408>] (__do_softirq+0x1e0/0x664)
> [ 211.211123] [<c0101408>] (__do_softirq) from [<c0130924>] (irq_exit+0x158/0x16c)
> [ 211.218467] [<c0130924>] (irq_exit) from [<c01a1ef0>] (__handle_domain_irq+0x80/0xec)
> [ 211.226304] [<c01a1ef0>] (__handle_domain_irq) from [<c0536eac>] (gic_handle_irq+0x58/0x9c)
> [ 211.234626] [<c0536eac>] (gic_handle_irq) from [<c0100af0>] (__irq_svc+0x70/0xb0)
> [ 211.241982] Exception stack(0xc1101f10 to 0xc1101f58)
> [ 211.246789] 1f00: ffffffff ffffffff 00000001 0008f0bd
> [ 211.255230] 1f20: ffffe000 c1108eec c1108f30 00000001 00000000 c0df311c 00000000 c1076028
> [ 211.262303] exynos5-hsi2c 12ca0000.i2c: tx timeout
> [ 211.263351] 1f40: 00000000 c1101f60 c01097f8 c01097fc 600f0113 ffffffff
> [ 211.263649] [<c0100af0>] (__irq_svc) from [<c01097fc>] (arch_cpu_idle+0x24/0x44)
> [ 211.263771] [<c01097fc>] (arch_cpu_idle) from [<c01640c8>] (do_idle+0x214/0x2c0)
> [ 211.289414] [<c01640c8>] (do_idle) from [<c0164528>] (cpu_startup_entry+0x18/0x1c)
> [ 211.296999] [<c0164528>] (cpu_startup_entry) from [<c1000e54>] (start_kernel+0x4e8/0x520)
> [ 211.305822] irq event stamp: 585972
> [ 211.308637] hardirqs last enabled at (585984): [<c0100b0c>] __irq_svc+0x8c/0xb0
> [ 211.316470] hardirqs last disabled at (585993): [<c019ed9c>] console_unlock+0xd4/0x654
> [ 211.324282] softirqs last enabled at (585920): [<c0130640>] irq_enter_rcu+0x7c/0x84
> [ 211.332072] softirqs last disabled at (585921): [<c0130924>] irq_exit+0x158/0x16c
> [ 211.339329] ---[ end trace 5726ca773f159ae9 ]---
>
> After that, the board continue working from serial console only, but the board
> doesn't pong anymore.
>
> Reverting some change fix the issue.

Okay, I've finally found some time to analyze this. Your proposed change
simply disables devfreq for the FSYS bus, and it looks like it wasn't
working from the beginning, due to a misunderstanding of what
'opp-shared' means.

When an OPP table has the 'opp-shared' property and one bus sets the
OPP (in our case the FSYS_APB bus), the framework assumes that the OPP
for the other bus sharing the table is then set automatically, so it
will not call clk_set_rate() for the clock of the FSYS bus. This is
fine if two buses share both the OPPs and the clock, but here we have
two buses 'sharing' OPPs while using different clocks.
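
To illustrate the effect described above, here is a toy model in plain
C. It is only a sketch of the behaviour as I described it, not the real
OPP framework code: the toy_* names are made up for this example, and
the 200 MHz / 100 MHz rates are arbitrary placeholders, not the real
FSYS values.

```c
#include <stdio.h>

/* Toy model only: none of these names are kernel APIs. */
struct toy_opp_table {
	unsigned long cur_hz;	/* OPP the table believes is active */
	int shared;		/* models a table marked as shared */
};

struct toy_bus {
	const char *name;
	unsigned long clk_hz;	/* rate of this bus's own clock */
	struct toy_opp_table *table;
};

/* Returns 1 if the bus clock was reprogrammed, 0 if the call was skipped. */
static int toy_set_opp(struct toy_bus *bus, unsigned long hz)
{
	/* Shared table already at the target: assume all sharers are too. */
	if (bus->table->shared && bus->table->cur_hz == hz)
		return 0;

	bus->clk_hz = hz;		/* stands in for clk_set_rate() */
	bus->table->cur_hz = hz;
	return 1;
}

/* Returns the FSYS clock rate after both buses request the same OPP. */
static unsigned long toy_demo(void)
{
	struct toy_opp_table table = { .cur_hz = 0, .shared = 1 };
	struct toy_bus fsys_apb = { "FSYS_APB", 200000000UL, &table };
	struct toy_bus fsys = { "FSYS", 200000000UL, &table };

	toy_set_opp(&fsys_apb, 100000000UL);	/* reprograms FSYS_APB clock */
	toy_set_opp(&fsys, 100000000UL);	/* skipped: table already there */

	printf("%s: %lu Hz, %s: %lu Hz\n", fsys_apb.name, fsys_apb.clk_hz,
	       fsys.name, fsys.clk_hz);
	return fsys.clk_hz;
}
```

Calling toy_demo() from main() shows that the FSYS clock keeps its old
rate even though the table claims the new OPP is active, because the
two buses do not actually share a clock.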

I remember that you have tried different frequencies for the FSYS bus,
but in all cases it sooner or later caused a USB host crash. I think
the best way to fix this issue is simply to remove the FSYS bus
devfreq, as scaling its frequency breaks USB host operation on some
boards.

I will prepare a patch removing the FSYS bus entry, with the above
description and your 'Reported-by' tag.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland