Re: [tip:smp/hotplug] cpu/hotplug: Abort disabling secondary CPUs if wakeup is pending

From: Qais Yousef
Date: Mon Mar 30 2020 - 14:34:41 EST


On 03/27/20 10:53, Boqun Feng wrote:
> Hi Thomas and Pavankumar,
>
> I have a question about this patch, please see below:
>
> On Wed, Jun 12, 2019 at 05:34:08AM -0700, tip-bot for Pavankumar Kondeti wrote:
> > Commit-ID: a66d955e910ab0e598d7a7450cbe6139f52befe7
> > Gitweb: https://git.kernel.org/tip/a66d955e910ab0e598d7a7450cbe6139f52befe7
> > Author: Pavankumar Kondeti <pkondeti@xxxxxxxxxxxxxx>
> > AuthorDate: Mon, 3 Jun 2019 10:01:03 +0530
> > Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > CommitDate: Wed, 12 Jun 2019 11:03:05 +0200
> >
> > cpu/hotplug: Abort disabling secondary CPUs if wakeup is pending
> >
> > When "deep" suspend is enabled, all CPUs except the primary CPU are frozen
> > via CPU hotplug one by one. After all secondary CPUs are unplugged the
> > wakeup pending condition is evaluated and if pending the suspend operation
> > is aborted and the secondary CPUs are brought up again.
> >
> > CPU hotplug is a slow operation, so it makes sense to check for wakeup
> > pending in the freezer loop before bringing down the next CPU. This
> > improves the system suspend abort latency significantly.
> >
>
> From the commit message, it makes sense to add the pm_wakeup_pending()
> check if freeze_secondary_cpus() is used for system suspend. However,
> freeze_secondary_cpus() is also used in kexec path on arm64:
>
> kernel_kexec():
> machine_shutdown():
> disable_nonboot_cpus():
> freeze_secondary_cpus()

FWIW, I fixed this already and the change was picked up:

https://lore.kernel.org/lkml/20200323135110.30522-7-qais.yousef@xxxxxxx/

Only x86 now uses disable_nonboot_cpus().

# tip/smp/core

$ git grep disable_nonboot_cpus
Documentation/power/suspend-and-cpuhotplug.rst: disable_nonboot_cpus()
Documentation/power/suspend-and-cpuhotplug.rst: /* disable_nonboot_cpus() complete */
arch/x86/power/cpu.c: ret = disable_nonboot_cpus();
include/linux/cpu.h:static inline int disable_nonboot_cpus(void)
include/linux/cpu.h:static inline int disable_nonboot_cpus(void) { return 0; }
kernel/cpu.c: * this even in case of failure as all disable_nonboot_cpus() users are

Thanks

--
Qais Yousef

>
> , so I wonder whether the pm_wakeup_pending() makes sense in this
> situation? Because IIUC, in this case we want to reboot the system
> regardlessly, the pm_wakeup_pending() checking seems to be inappropriate
> then.
>
> I'm asking this because I'm debugging a kexec failure on ARM64 guest on
> Hyper-V, and I got the BUG_ON() triggered:
>
> [ 108.378016] kexec_core: Starting new kernel
> [ 108.378018] Disabling non-boot CPUs ...
> [ 108.378019] Wakeup pending. Abort CPU freeze
> [ 108.378020] Non-boot CPUs are not disabled
> [ 108.378049] ------------[ cut here ]------------
> [ 108.378050] kernel BUG at arch/arm64/kernel/machine_kexec.c:154!
>
> Thanks!
>
> Regards,
> Boqun
>
> > [ tglx: Massaged changelog and improved printk message ]
> >
> > Signed-off-by: Pavankumar Kondeti <pkondeti@xxxxxxxxxxxxxx>
> > Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Cc: "Rafael J. Wysocki" <rjw@xxxxxxxxxxxxx>
> > Cc: Len Brown <len.brown@xxxxxxxxx>
> > Cc: Pavel Machek <pavel@xxxxxx>
> > Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> > Cc: iri Kosina <jkosina@xxxxxxx>
> > Cc: Mukesh Ojha <mojha@xxxxxxxxxxxxxx>
> > Cc: linux-pm@xxxxxxxxxxxxxxx
> > Link: https://lkml.kernel.org/r/1559536263-16472-1-git-send-email-pkondeti@xxxxxxxxxxxxxx
> >
> > ---
> > kernel/cpu.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index be82cbc11a8a..0778249cd49d 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -1221,6 +1221,13 @@ int freeze_secondary_cpus(int primary)
> > for_each_online_cpu(cpu) {
> > if (cpu == primary)
> > continue;
> > +
> > + if (pm_wakeup_pending()) {
> > + pr_info("Wakeup pending. Abort CPU freeze\n");
> > + error = -EBUSY;
> > + break;
> > + }
> > +
> > trace_suspend_resume(TPS("CPU_OFF"), cpu, true);
> > error = _cpu_down(cpu, 1, CPUHP_OFFLINE);
> > trace_suspend_resume(TPS("CPU_OFF"), cpu, false);