Re: EIP is at device_shutdown+0x32/0x60

From: Mark Lord
Date: Thu Nov 15 2007 - 13:29:20 EST


Greg KH wrote:
On Thu, Nov 15, 2007 at 12:07:48PM -0500, Mark Lord wrote:
..
Greg, I don't know if this is relevant or not,
but x86 has bugs in the halt/reboot code for SMP.

Specifically, in native_smp_send_stop() the code now uses
spin_trylock() to "lock" the shared call buffers,
but then ignores the result.

This means that multiple CPUs can/will clobber each other
in that code.

The second bug, is that this code does not wait for the
target CPUs to actually stop before it continues.

This was the real cause of the failure-to-poweroff problems
I was having with 2.6.23, which we fixed by using CPU hotplug
to disable_nonboot_cpus() before the above code ever got run.

I have noticed that the shutdown path is quite weird, shutting down
sysdev devices differently depending on the type of shutdown, which is
probably not good.

But what change are you talking about for the poweroff problem? I have
a _lot_ of people reporting that 2.6.22 is not powering off for them and
I can't seem to figure it out. Do you have a changeset for something
that went in to fix this issue?
..

Well, the real bugs that cause the problem are described by me above.
I don't have a fix for those, but the workaround is in 2.6.23
under git 4047727e5ae33f9b8d2b7766d1994ea6e5ec2991 Fix SMP poweroff hangs.

With that workaround, there's no more hanging on halt,
though there could still be a hang on reboot.

A problem with that workaround is that it has no effect unless
the CPU hotplug code is configured (CONFIG_PM_SLEEP_SMP and pals).

Cheers
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/