Re: [RFC PATCH V2] PM/CPU: Parallel enalbing nonboot cpus with resume devices

From: Rafael J. Wysocki
Date: Fri Aug 29 2014 - 20:22:34 EST


On 8/29/2014 5:40 AM, Lan Tianyu wrote:
On 2014å08æ22æ 16:33, Lan Tianyu wrote:
In the current world, all nonboot cpus are enabled serially during system
resume. System resume sequence is that boot cpu enables nonboot cpu one by
one and then resume devices. Before resuming devices, there are few tasks
assigned to nonboot cpus after they are brought up. This waste cpu usage.

To accelerate S3, this patches allows boot cpu to go forward to resume
devices after bringing up one nonboot cpu. The nonboot cpu will be in
charge of bringing up other cpus. This makes enabling cpu2~x parallel
with resuming devices.

Signed-off-by: Lan Tianyu <tianyu.lan@xxxxxxxxx>
---
Change since V1:
Remove PM_PARALLEL_CPU_UP_FOR_SUSPEND kernel config and make
paralleling cpu as default behaviour. Add error handling for
failure of the first frozen cpu online.

So far, I just tested the patch on the Intel machines. It's better
to test it on the others Arch platforms. Appreciate a lot if some
one can help test it.

Hi All:
Any comments on this patch? Thanks.

You need to ensure that the async thing completes before cpufreq_resume() or bad things will happen I think.

kernel/cpu.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++++-----------
1 file changed, 58 insertions(+), 12 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index a343bde..9bc8497 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -551,8 +551,42 @@ void __weak arch_enable_nonboot_cpus_end(void)
{
}
+static int _cpu_up_with_trace(int cpu)

Better name?

+{
+ int error;
+
+ trace_suspend_resume(TPS("CPU_ON"), cpu, true);
+ error = _cpu_up(cpu, 1);
+ trace_suspend_resume(TPS("CPU_ON"), cpu, false);
+ if (error) {
+ pr_warn("Error taking CPU%d up: %d\n", cpu, error);
+ return error;
+ }
+
+ pr_info("CPU%d is up\n", cpu);
+ return 0;
+}
+
+static int async_enable_nonboot_cpus(void *data)
+{
+ int cpu;
+
+ cpu_maps_update_begin();
+ arch_enable_nonboot_cpus_begin();

Shouldn't you call this before trying to bring up the first one?

+
+ for_each_cpu(cpu, frozen_cpus) {
+ _cpu_up_with_trace(cpu);
+ }
+
+ arch_enable_nonboot_cpus_end();
+ cpumask_clear(frozen_cpus);
+ cpu_maps_update_done();
+ return 0;
+}
+
void __ref enable_nonboot_cpus(void)
{
+ struct task_struct *tsk;
int cpu, error;
/* Allow everyone to use the CPU hotplug again */
@@ -563,22 +597,34 @@ void __ref enable_nonboot_cpus(void)
pr_info("Enabling non-boot CPUs ...\n");
- arch_enable_nonboot_cpus_begin();
+ cpu = cpumask_first(frozen_cpus);
+ cpumask_clear_cpu(cpu, frozen_cpus);
- for_each_cpu(cpu, frozen_cpus) {
- trace_suspend_resume(TPS("CPU_ON"), cpu, true);
- error = _cpu_up(cpu, 1);
- trace_suspend_resume(TPS("CPU_ON"), cpu, false);
- if (!error) {
- pr_info("CPU%d is up\n", cpu);
- continue;
+ error = _cpu_up_with_trace(cpu);
+ if (cpumask_empty(frozen_cpus))
+ goto out;
+
+ if (error) {
+ /*
+ * If fail to bring up the first frozen cpus,
+ * enable the rest frozen cpus on the boot cpu.
+ */
+ arch_enable_nonboot_cpus_begin();
+ for_each_cpu(cpu, frozen_cpus) {
+ _cpu_up_with_trace(cpu);
}
- pr_warn("Error taking CPU%d up: %d\n", cpu, error);
- }
+ arch_enable_nonboot_cpus_end();
- arch_enable_nonboot_cpus_end();
+ } else {
+ tsk = kthread_create_on_cpu(async_enable_nonboot_cpus,
+ NULL, cpu, "async-enable-nonboot-cpus");

Does it really need to run on the other CPU? If the idea is to avoid waiting mostly, the async thread can start running on the boot CPU just fine I suppose.

+ if (IS_ERR(tsk)) {
+ pr_err("Failed to create async enable nonboot cpus thread.\n");
+ goto out;
+ }
- cpumask_clear(frozen_cpus);
+ kthread_unpark(tsk);
+ }
out:
cpu_maps_update_done();
}



Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/