Linux 3.1.5/6 regression: fails to resume from suspend (bisected)

From: Philip Langdale
Date: Tue Dec 27 2011 - 00:52:58 EST


Hi,

Since upgrading to 3.1.5, and still with 3.1.6, my machine fails to
resume from suspend. I bisected the regression to the following
change:

commit aeed6baa702a285cf03b7dc4182ffc1a7f4e4ed6
Author: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Date: Fri Dec 2 16:02:45 2011 +0100

    clockevents: Set noop handler in clockevents_exchange_device()

    commit de28f25e8244c7353abed8de0c7792f5f883588c upstream.

    If a device is shutdown, then there might be a pending interrupt,
    which will be processed after we reenable interrupts, which causes
    the original handler to be run. If the old handler is the
    (broadcast) periodic handler the shutdown state might hang the
    kernel completely.

    Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
    Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxx>

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index e4c699d..13dfaab 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -286,6 +286,7 @@ void clockevents_exchange_device(struct clock_event_device *old,
 	 * released list and do a notify add later.
 	 */
 	if (old) {
+		old->event_handler = clockevents_handle_noop;
 		clockevents_set_mode(old, CLOCK_EVT_MODE_UNUSED);
 		list_del(&old->list);
 		list_add(&old->list, &clockevents_released);

If I undo this change in my 3.1.6 tree, I am then able to resume as
before.
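
For anyone not familiar with this code path, the intent of the one-line
change quoted above can be modelled in user space. This is only a rough
sketch under stated assumptions: every fake_* name and run_scenario()
are hypothetical stand-ins invented for illustration, not kernel APIs,
and the model does not attempt to reproduce or explain the resume
failure itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct clock_event_device. */
struct fake_clock_event_device {
	int shut_down;            /* set once the device has been released */
	int stale_periodic_runs;  /* periodic handler calls after shutdown */
	void (*event_handler)(struct fake_clock_event_device *dev);
};

/* Stand-in for the periodic handler; running it on a shut-down
 * device is the situation the commit message describes. */
static void fake_handle_periodic(struct fake_clock_event_device *dev)
{
	if (dev->shut_down)
		dev->stale_periodic_runs++;
}

/* Stand-in for clockevents_handle_noop(): deliberately does nothing. */
static void fake_handle_noop(struct fake_clock_event_device *dev)
{
	(void)dev;
}

/* Simplified model of the "if (old)" branch in the quoted hunk.
 * set_noop selects behaviour with (1) or without (0) the added line. */
static void fake_exchange_device(struct fake_clock_event_device *old,
				 int set_noop)
{
	if (old) {
		if (set_noop)
			old->event_handler = fake_handle_noop;
		old->shut_down = 1;
	}
}

/* Models an interrupt that was already pending at shutdown and is
 * delivered once interrupts are re-enabled. */
static void fake_deliver_pending_irq(struct fake_clock_event_device *dev)
{
	assert(dev->event_handler != NULL);
	dev->event_handler(dev);
}

/* Returns how often the periodic handler ran on a shut-down device. */
static int run_scenario(int set_noop)
{
	struct fake_clock_event_device dev = { 0, 0, fake_handle_periodic };

	fake_exchange_device(&dev, set_noop);
	fake_deliver_pending_irq(&dev);
	return dev.stale_periodic_runs;
}
```

Calling run_scenario(0) from a small main() models the code without the
added line, where the stale periodic handler still runs once after
shutdown; run_scenario(1) models the patched code, where the pending
interrupt lands in the no-op handler instead.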

This was also reported in the Fedora bug tracker, but was not fully
diagnosed there:

https://bugzilla.redhat.com/show_bug.cgi?id=767248

I wouldn't be surprised if it's related to the NVIDIA binary drivers
in some way (I use them and so does the Fedora bug reporter), but it's
not practical for me to avoid them.

--phil