Re: [linux-pm] intermittent suspend problem again

From: Ferenc Wagner
Date: Tue Nov 17 2009 - 20:13:08 EST


"Rafael J. Wysocki" <rjw@xxxxxxx> writes:

> On Saturday 14 November 2009, Ferenc Wagner wrote:
>
>> Are other pm_test values meaningful, or possibly harmful?
>
> They are supposed to work as for suspend.
>
>> I think I tried freezer, which resulted in a seemingly perfect
>> suspend, but the machine didn't try to resume afterwards, but booted
>> normally instead...
>
> So this sounds like there's a bug (will check).

I rechecked this: the freezer "test" does not stop after freezing but
goes on to suspend the machine; generally it's even possible to resume
from the image, but this should still be a bug. Can you reproduce it?
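
To double-check which test level is actually selected before a run, the
bracketed entry in /sys/power/pm_test can be read back. The following is
only a minimal sketch: the helper names pm_test_selected() and
pm_test_current() are made up for illustration, and the file exists only
on kernels built with PM debugging support.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the bracketed (currently selected) entry of a pm_test-style
 * line, e.g. "[none] core processors platform devices freezer",
 * into out.  Returns 0 on success, -1 if there is no complete
 * "[...]" entry or it does not fit into out.
 */
static int pm_test_selected(const char *line, char *out, size_t outlen)
{
    assert(line != NULL && out != NULL);

    const char *open = strchr(line, '[');
    if (open == NULL)
        return -1;
    const char *close = strchr(open + 1, ']');
    if (close == NULL)
        return -1;

    size_t len = (size_t)(close - open - 1);
    if (len + 1 > outlen)       /* need room for the terminating NUL */
        return -1;
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}

/* Read the selected level from sysfs; -1 if the file is unavailable. */
static int pm_test_current(char *out, size_t outlen)
{
    char line[128];
    FILE *f = fopen("/sys/power/pm_test", "r");

    if (f == NULL)
        return -1;
    char *got = fgets(line, sizeof line, f);
    fclose(f);
    return got != NULL ? pm_test_selected(line, out, outlen) : -1;
}
```

Calling pm_test_current(buf, sizeof buf) right before s2disk and logging
buf would at least confirm that "freezer" was still selected when the
machine went on to suspend.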

Meanwhile I managed to freeze the machine in the "Snapshotting system"
phase (that is, in the SNAPSHOT_CREATE_IMAGE ioctl) again. SysRq
reacted, but didn't produce any output besides the name of the invoked
function or the help text. It couldn't power off the machine, but it
was able to reboot it.

Since I instrumented s2disk and the hibernation path, no freeze has
happened while hibernating the machine. Instead, it froze once while
rebooting, when I wanted to replace the kernel. It was the usual stuff:
everything went smoothly until the last step, then the final syscall
with the magic constants, then silence... It's starting to look like
this bug has nothing to do with hibernation after all; hibernation is
just the shutdown method I use most often, so the bug surfaced there.
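
For reference, the "final syscall with the magic constants" is
presumably reboot(2). A minimal sketch of its raw form follows; the
wrapper name raw_reboot() is made up for illustration, and what the
shutdown scripts here actually call has not been verified.

```c
#define _GNU_SOURCE
#include <linux/reboot.h>   /* LINUX_REBOOT_MAGIC1/2, LINUX_REBOOT_CMD_* */
#include <sys/syscall.h>    /* SYS_reboot */
#include <unistd.h>         /* syscall() */

/*
 * Issue the raw reboot(2) syscall with the two magic numbers the
 * kernel requires.  cmd is one of the LINUX_REBOOT_CMD_* values.
 *
 * WARNING: with CAP_SYS_BOOT this takes effect immediately and does
 * not sync filesystems first.  Without the capability it returns -1
 * with errno set to EPERM.
 */
static long raw_reboot(unsigned int cmd)
{
    return syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2,
                   cmd, NULL);
}
```

A restart would be raw_reboot(LINUX_REBOOT_CMD_RESTART) and a power-off
raw_reboot(LINUX_REBOOT_CMD_POWER_OFF); the silence described above
starts once the kernel has accepted such a call.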

I tried various things after starting with init=/bin/bash, but I wasn't
able to cancel the suspend then, so I introduced the "always cancel"
parameter (please find my current patch queue attached). With that, I'm
able to freeze the machine in 2-5 tries: after a couple of perfect runs,
s2disk -P"always cancel=y" returns normally to the starting screen, but
I'm left with a totally unresponsive machine. If I didn't botch my
patches, this may be a lead worth following: still not 100%
reproducible, but almost.

Btw, no matter whether I set the suspend loglevel to 1 or 2, the usual
unqualified printks didn't make it to the console s2disk uses (not even
ones from before suspend_console() in hibernation_platform_enter(), or
even simple ioctl traces for /dev/snapshot). Would it be possible to
work around this by skipping prepare_console or similar?

And one last thing: when I set the resume device to /root/strace (yes,
that binary), s2disk gave a rather strange report:
s2disk: Invalid resume device. Reason: Success.
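
A message like this typically appears when strerror(errno) is printed
for a failure that no syscall reported. The sketch below shows one way
that can happen and one possible fix; it is a guess at the mechanism,
not taken from the s2disk sources, and both function names are
hypothetical.

```c
#include <errno.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical check: stat() succeeds on a regular file such as
 * /root/strace, so errno is left at 0 when the "not a block device"
 * test fails.  A caller printing strerror(errno) then reports
 * "Success" (on glibc).
 */
static int check_resume_device(const char *path)
{
    struct stat st;

    errno = 0;
    if (stat(path, &st) != 0)
        return -1;              /* errno was set by stat() */
    if (!S_ISBLK(st.st_mode))
        return -1;              /* errno is still 0 here */
    return 0;
}

/* Same check, but with a meaningful errno for the non-syscall failure. */
static int check_resume_device_fixed(const char *path)
{
    struct stat st;

    errno = 0;
    if (stat(path, &st) != 0)
        return -1;              /* errno was set by stat() */
    if (!S_ISBLK(st.st_mode)) {
        errno = ENOTBLK;        /* "Block device required" */
        return -1;
    }
    return 0;
}
```

With the second variant the report for a regular file would end in
"Block device required" instead of "Success".
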
--
Regards,
Feri.