[PATCH v2] power: suspend: Replace dpm_watchdog by sleep timer

From: josephjang
Date: Wed Oct 21 2020 - 08:01:37 EST


> On Wed, Oct 21, 2020 at 4:09 AM Joseph Jang <josephjang@xxxxxxxxxx> wrote:
> >
> > Since dpm_watchdog just cover device power management, we proposed sleep
> > timer to cover not only device power management hang issues, but also
> > core power management hand issue.
> >
> > Add sleep timer and timeout handler to prevent device stuck during suspend/
> > resume process. The timeout handler will dump disk sleep task at first
> > round timeout and trigger kernel panic at second round timeout.
> > The default timer for each round is defined in
> > CONFIG_PM_SLEEP_TIMER_TIMEOUT.
>
> Let me repeat the point that Greg has made already: Please don't
> replace the existing watchdog with something else, but try to extend
> it to cover your use case.
>
> As it stands, the patch is not applicable IMV.
>
> Thanks!

Thank you Rafael's promptly response.
Since dmp_watchdog was implement in drivers/base/power/main.c and it use stack memory for timer struct.
It is a little bit hard to use its API in kernel/power/suspend.c.
And from the call flow, we prefer to start/stop watchog timer in suspend.c only. It will also cover device
power management as well.

If you don't want to change Kconfig, I can keep original DPM_WATCHDOG.
But it is not easy to extend driver base API into core power base. Am I correct?

If you have some suggestion, please feel free to let me know, I will follow up.

Thank you,
Joseph.


Let me try to explain why cannot use current dpm_watchdog API, because it use stack memory and has limitation.

/**
* dpm_watchdog_set - Enable pm watchdog for given device.
* @wd: Watchdog. Must be allocated on the stack.
* @dev: Device to handle.
*/
static void dpm_watchdog_set(struct dpm_watchdog *wd, struct device *dev)
{
struct timer_list *timer = &wd->timer;

wd->dev = dev;
wd->tsk = current;

timer_setup_on_stack(timer, dpm_watchdog_handler, 0); <== setup timer on stack memory ....
/* use same timeout value for both suspend and resume */
timer->expires = jiffies + HZ * CONFIG_DPM_WATCHDOG_TIMEOUT;
add_timer(timer);
}


If you could take a look at my patch again, you can see we need to start watchdog at end of function like s2idle_enter().
We cannot use stack memory for timer because it will be free when go back to caller.
That is why I am creating new API to start/stop watchdog timer which use global memory.

/**
* pm_suspend_default_s2idle - Check if suspend-to-idle is the default suspend.
*
@@ -89,6 +92,8 @@ static void s2idle_enter(void)
{
trace_suspend_resume(TPS("machine_suspend"), PM_SUSPEND_TO_IDLE, true);

+ stop_sleep_timer(&st);
+
raw_spin_lock_irq(&s2idle_lock);
if (pm_wakeup_pending())
goto out;
@@ -114,6 +119,8 @@ static void s2idle_enter(void)
s2idle_state = S2IDLE_STATE_NONE;
raw_spin_unlock_irq(&s2idle_lock);

+ start_sleep_timer(&st);
+
trace_suspend_resume(TPS("machine_suspend"), PM_SUSPEND_TO_IDLE, false);
}


Base on that, could you help to review if we could create new API to start/stop watchdog timer for core power management?

Thank you,
Joseph.