Re: [PATCH RESEND v8 09/10] watchdog: rzg2l_wdt: Power on the PM domain in rzg2l_wdt_restart()
From: claudiu beznea
Date: Mon May 20 2024 - 07:55:40 EST
Hi, Ulf,
On 29.04.2024 17:19, Ulf Hansson wrote:
> On Wed, 24 Apr 2024 at 13:14, claudiu beznea <claudiu.beznea@xxxxxxxxx> wrote:
>>
>> Hi, Ulf,
>>
>> On 12.04.2024 17:02, claudiu beznea wrote:
>>> Hi, Ulf,
>>>
>>> On 12.04.2024 14:14, Ulf Hansson wrote:
>>>> On Wed, 10 Apr 2024 at 16:19, Claudiu <claudiu.beznea@xxxxxxxxx> wrote:
>>>>>
>>>>> From: Claudiu Beznea <claudiu.beznea.uj@xxxxxxxxxxxxxx>
>>>>>
>>>>> The rzg2l_wdt_restart() is called from atomic context. Calling
>>>>> pm_runtime_{get_sync, resume_and_get}() or any other runtime PM resume
>>>>> APIs is not an option as it may lead to issues as described in commit
>>>>> e4cf89596c1f ("watchdog: rzg2l_wdt: Fix 'BUG: Invalid wait context'")
>>>>> that removed the pm_runtime_get_sync() and used directly the
>>>>> clk_prepare_enable() APIs.
>>>>>
>>>>> Starting with RZ/G3S the watchdog could be part of its own software
>>>>> controlled power domain (see the initial implementation in Link section).
>>>>> In case the watchdog is not used the power domain is off and accessing
>>>>> watchdog registers leads to aborts.
>>>>>
>>>>> To solve this the patch powers on the power domain using
>>>>> dev_pm_genpd_resume() API before enabling its clock. This is not
>>>>> sleeping or taking any other locks as the power domain will not be
>>>>> registered with GENPD_FLAG_IRQ_SAFE flags.
>>>>>
>>>>> Link: https://lore.kernel.org/all/20240208124300.2740313-1-claudiu.beznea.uj@xxxxxxxxxxxxxx
>>>>> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@xxxxxxxxxxxxxx>
>>>>> ---
>>>>>
>>>>> Changes in v8:
>>>>> - none, this patch is new
>>>>>
>>>>> drivers/watchdog/rzg2l_wdt.c | 12 ++++++++++++
>>>>> 1 file changed, 12 insertions(+)
>>>>>
>>>>> diff --git a/drivers/watchdog/rzg2l_wdt.c b/drivers/watchdog/rzg2l_wdt.c
>>>>> index c8c20cfb97a3..98e5e9914a5d 100644
>>>>> --- a/drivers/watchdog/rzg2l_wdt.c
>>>>> +++ b/drivers/watchdog/rzg2l_wdt.c
>>>>> @@ -12,6 +12,7 @@
>>>>> #include <linux/module.h>
>>>>> #include <linux/of.h>
>>>>> #include <linux/platform_device.h>
>>>>> +#include <linux/pm_domain.h>
>>>>> #include <linux/pm_runtime.h>
>>>>> #include <linux/reset.h>
>>>>> #include <linux/units.h>
>>>>> @@ -164,6 +165,17 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev,
>>>>> struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev);
>>>>> int ret;
>>>>>
>>>>> + /*
>>>>> + * The device may be part of a power domain that is currently
>>>>> + * powered off. We need to power it up before accessing registers.
>>>>> + * We don't undo the dev_pm_genpd_resume() as the device need to
>>>>> + * be up for the reboot to happen. Also, as we are in atomic context
>>>>> + * here there is no need to increment PM runtime usage counter
>>>>> + * (to make sure pm_runtime_active() doesn't return wrong code).
>>>>> + */
>>>>> + if (!pm_runtime_active(wdev->parent))
>>>>> + dev_pm_genpd_resume(wdev->parent);
>>>>> +
>>>>
>>>> I doubt this is the correct solution, but I may be wrong. Unless this
>>>> is invoked at the syscore stage?
>>>
>>> On my case I see it invoked from kernel_restart(). As of my code reading,
>>
>> With the above explanations, do you consider calling dev_pm_genpd_resume()
>> here is still wrong?
>
> Yes. At least, those genpd functions were not added to cope for cases like this.
Sorry to bother you, do you have some suggestions on this topic?
On my side I did some investigation to see how else it could be implemented
but I don't have much clue how to go forward.
Would you prefer to have a separate API to deal with domain power on in
this scenario? Maybe one that should run only in the reboot context?
Would you consider only updating the description of dev_pm_genpd_resume()
and genpd_sync_power_on() to specify that it could run in a reboot context?
Would you consider updating the genpd_switch_state() to take into system
reboot state and do locking based on that, too?
>
> Moreover, you still need to find another solution as
> clk_prepare_enable() can't be called in this path.
The clock driver doesn't implement clk_ops::prepare in all
micro-architectures that this watchdog driver is used. This may be the
reason the clk_prepare_enable() was used on this path from the beginning.
Even though, a simple solution I have in mind for this is to keep the clk
prepared all the time.
Thank you,
Claudiu Beznea
>
>>
>> Do you have any suggestions I could try?
>
> Not at the moment, but I will try to circle back to this topic more
> thinking next week, when I have some more time.
>
>>
>> Thank you,
>> Claudiu Beznea
>
> Kind regards
> Uffe
>
>>
>>> at that point only one CPU is active with IRQs disabled (done in
>>> machine_restart()). Below is the stack trace decoded on next-20240410 with
>>> this series
>>> (https://lore.kernel.org/all/20240410134044.2138310-1-claudiu.beznea.uj@xxxxxxxxxxxxxx/)
>>> on top and the one from here (adding power domain support):
>>> https://lore.kernel.org/all/20240410122657.2051132-1-claudiu.beznea.uj@xxxxxxxxxxxxxx/
>>>
>>> Hardware name: Renesas SMARC EVK version 2 based on r9a08g045s33 (DT)
>>> Call trace:
>>> dump_backtrace (arch/arm64/kernel/stacktrace.c:319)
>>> show_stack (arch/arm64/kernel/stacktrace.c:326)
>>> dump_stack_lvl (lib/dump_stack.c:117)
>>> dump_stack (lib/dump_stack.c:124)
>>> rzg2l_wdt_restart (drivers/watchdog/rzg2l_wdt.c:180)
>>> watchdog_restart_notifier (drivers/watchdog/watchdog_core.c:188)
>>> atomic_notifier_call_chain (kernel/notifier.c:98 kernel/notifier.c:231)
>>> do_kernel_restart (kernel/reboot.c:236)
>>> machine_restart (arch/arm64/kernel/process.c:145)
>>> kernel_restart (kernel/reboot.c:287)
>>> __do_sys_reboot (kernel/reboot.c:755)
>>> __arm64_sys_reboot (kernel/reboot.c:715)
>>> invoke_syscall (arch/arm64/include/asm/current.h:19
>>> arch/arm64/kernel/syscall.c:53)
>>> el0_svc_common.constprop.0 (include/linux/thread_info.h:127
>>> arch/arm64/kernel/syscall.c:141)
>>> do_el0_svc (arch/arm64/kernel/syscall.c:153)
>>> el0_svc (arch/arm64/include/asm/irqflags.h:56
>>> arch/arm64/include/asm/irqflags.h:77 arch/arm64/kernel/entry-common.c:165
>>> arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713)
>>> el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:731)
>>> el0t_64_sync (arch/arm64/kernel/entry.S:598)
>>>
>>> The watchdog restart handler is added in restart_handler_list and this list
>>> is invoked though do_kernel_restart(). As of my code investigation the
>>> restart_handler_list is invoked only though do_kernel_restart() and only
>>> though the stack trace above.
>>>
>>> Thank you,
>>> Claudiu Beznea
>>>
>>>>
>>>>> clk_prepare_enable(priv->pclk);
>>>>> clk_prepare_enable(priv->osc_clk);
>>>>>
>>>>> --
>>>>> 2.39.2
>>>>>
>>>>>
>>>>
>>>> Can you redirectly me to the complete series, so I can have a better
>>>> overview of the problem?
>>>
>>> This is the series that adds power domain support for RZ/G3S SoC:
>>> https://lore.kernel.org/all/20240410122657.2051132-1-claudiu.beznea.uj@xxxxxxxxxxxxxx/
>>>
>>> This is the series that adds watchdog support for RZ/G3S SoC:
>>> https://lore.kernel.org/all/20240410134044.2138310-1-claudiu.beznea.uj@xxxxxxxxxxxxxx/
>>>
>>> Thank you for your review,
>>> Claudiu Beznea
>>>
>>>>
>>>> Kind regards
>>>> Uffe