Re: SMP poweroff hangs: it's baaaack! But on x86_64 this time.
From: Pavel Machek
Date: Fri Dec 19 2008 - 06:59:44 EST
On Wed 2008-12-17 10:48:02, Mark Lord wrote:
>> Subject: Fix SMP poweroff hangs
>> From: Mark Lord <lkml@xxxxxx>
>>
>> We need to disable all CPUs other than the boot CPU (usually 0) before
>> attempting to power-off modern SMP machines. This fixes the
>> hang-on-poweroff issue on my MythTV SMP box, and also on Thomas Gleixner's
>> new toybox.
>>
>> Signed-off-by: Mark Lord <mlord@xxxxxxxxx>
>> Acked-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> Cc: "Rafael J. Wysocki" <rjw@xxxxxxx>
>> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> ---
>>
>> kernel/sys.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff -puN kernel/sys.c~fix-smp-poweroff-hangs kernel/sys.c
>> --- a/kernel/sys.c~fix-smp-poweroff-hangs
>> +++ a/kernel/sys.c
>> @@ -32,6 +32,7 @@
>> #include <linux/getcpu.h>
>> #include <linux/task_io_accounting_ops.h>
>> #include <linux/seccomp.h>
>> +#include <linux/cpu.h>
>> #include <linux/compat.h>
>> #include <linux/syscalls.h>
>> @@ -878,6 +879,7 @@ void kernel_power_off(void)
>> kernel_shutdown_prepare(SYSTEM_POWER_OFF);
>> if (pm_power_off_prepare)
>> pm_power_off_prepare();
>> + disable_nonboot_cpus();
>> sysdev_shutdown();
>> printk(KERN_EMERG "Power down.\n");
>> machine_power_off();
> ..
>
> This bug has returned here now, but on x86_86 this time around.
> Same machine as before, just upgraded toa 64-bit kernel/user (2.6.27.9)
> from the original 32-bit kernel/user that was originally fixed (above).
>
> One hang at poweroff over the past 10 days. Not much, but enough
> to destroy confidence in "unattended" operation.
>
> I lack opportunity to dig further into the code for now,
> but just wanted to flag the problem, in case similar reports
> from others might already be out there.
>
> In the meanwhile, I'm experimenting with this simple patch,
> garnered from the 32-bit investigations last time around.
> We should know in a few weeks whether it has any effect or not.
>
> --- old/kernel/sys.c 2008-10-18 13:57:22.000000000 -0400
> +++ linux-2.6.27.9/kernel/sys.c 2008-12-17 09:42:17.000000000 -0500
> @@ -303,6 +303,8 @@
>
> static void kernel_shutdown_prepare(enum system_states state)
> {
> + set_cpus_allowed(current, cpumask_of_cpu(first_cpu(cpu_online_map)));
Is this line neccessary?
> + disable_nonboot_cpus();
> blocking_notifier_call_chain(&reboot_notifier_list,
> (state == SYSTEM_HALT)?SYS_HALT:SYS_POWER_OFF, NULL);
> system_state = state;
> @@ -333,7 +335,6 @@
> kernel_shutdown_prepare(SYSTEM_POWER_OFF);
> if (pm_power_off_prepare)
> pm_power_off_prepare();
> - disable_nonboot_cpus();
> sysdev_shutdown();
> printk(KERN_EMERG "Power down.\n");
> machine_power_off();
Do you have any idea why it helps? BIOS will see us shutting down on
cpu0 anyway, so if this helps there's a linux bug somewhere...
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/