[PATCH] power: suspend: Add suspend timeout handler

From: josephjang
Date: Tue Oct 20 2020 - 05:01:08 EST


On Tue, Oct 20, 2020 at 08:15:38AM +0000, josephjang@xxxxxxxxxx wrote:
> > On Tue, Oct 20, 2020 at 02:22:26PM +0800, Joseph Jang wrote:
> > > Add sleep timer and timeout handler to prevent device stuck during
> > suspend/
> > > resume process. The timeout handler will dump disk sleep task at first
> > > round timeout and trigger kernel panic at second round timeout.
> > > The default timer for each round is defined in
> > > CONFIG_PM_SLEEP_TIMER_TIMEOUT.
> > >
> > > Signed-off-by: Joseph Jang <josephjang@xxxxxxxxxx>
> > > ---
> > > MAINTAINERS | 2 +
> > > include/linux/console.h | 1 +
> > > include/linux/suspend_timer.h | 90 +++++++++++++++++++++++++++++++++++
>
> > Why is this file in include/linux/ if you only ever call it from one .c
> > file?
>
> I just refer to include/linux/suspend.h and create a new header file in the
> same folder.
> If you have a better location for the new header file, please feel free to
> let me know.

Only put .h files that are needed by different .c files in the
include/linux/ directory. Otherwise it should be local to where the .c
file is.
Great, use that!

> But we really hit the suspend hang issue that DPM_WATCHDOG cannot cover.

What issue is that?

> We propose a wide coverage debug feature like PM_SLEEP_MONITOR which
> not only covers PM but also core PM hang issues.
>
> And DPM_WATCHDOG is for device driver power management in
> drivers/base/power/main.c
> and PM_SLEEP_MONITOR locate is for core power management in
> kernel/power/suspend.c.
> I think it is fine for users to select whether they need device PM only or
> not.

How will a user know which they should use?

Why not just fix whatever is wrong with the watchdog code instead of
creating a new one?

> > And why isn't the watchdog sufficient for you? Why are you "open
> > coding" a watchdog timer logic here at all???
>
> Yes, we refer to DPM_WATCHDOG to extend the watchdog debugging for core PM.
> Because we really hit a real case that was not covered by DPM_WATCHDOG.

Then fix that!

> I think PM_SLEEP_MONITOR is an extension debug feature from DPM_WATCHDOG.

Please just fix the watchdog, as obviously it is not working properly.
Don't create something new because of that.

thanks,

greg k-h

Thank you Greq for promptly responding.
I am okay to fix the DPM_WATCHDOG feature, but would like to have quick sync up before start.
Could you help?


1. Can we change the kernel config name ?
DPM_WATCHDOG stands for Device Power Management.
We propose PM_SLEEP_MONITOR is to cover Core PM and Device PM.


2. Can we create a new data structure instead of using the following struct dpm_watchdog?
struct dpm_watchdog {
struct device *dev;
struct task_struct *tsk;
struct timer_list timer;
};

I list some reasons by following ...

static int device_resume(struct device *dev, pm_message_t state, bool async)
{
pm_callback_t callback = NULL;
const char *info = NULL;
int error = 0;
DECLARE_DPM_WATCHDOG_ON_STACK(wd); <== dpm_watchdog use stack memory for watchdog timer struct, but sleep timer use global memory.

...<SNIP>

if (!dpm_wait_for_superior(dev, async))
goto Complete;

dpm_watchdog_set(&wd, dev); <== dpm_watchdog need "struct device", but sleep timer doesn't need it.
device_lock(dev);

...<SNIP>

Unlock:
device_unlock(dev);
dpm_watchdog_clear(&wd);

Complete:
complete_all(&dev->power.completion);

TRACE_RESUME(error);

return error;
}

Thank you,
Joseph.