Bill Davidsen wrote:

Anyway, I pulled the plug on the UPS, and the system shut down. But when
it powered up, it booted the default kernel rather than the test kernel,
decided that it couldn't resume, and then did a cold boot.

Booting the machine isn't the kernel's job, it's the bootloader's job.

And resume is not the bootloader's job... if memory and registers are restored, and a jump is made to the resume address, a resumed system should result. Clearly some part of that didn't happen :-(

I can bypass this by making the debug kernel the default, but WHY? Is
the kernel not saved such that any kernel can be rolled back into memory
and run? Actually, the answer is HELL NO, so I really ask if this is the
intended mode of operation, that only the default boot kernel will restore.

Yes.
It is very dangerous to attempt a resume with a different kernel than the one
that has gone to sleep.
Different kernels may be compiled with different options that affect where or
how in-memory structures are saved.

If the mainline resume is depending on that, no wonder resume is so fragile. User action can change the order of module loads, kmalloc calls move allocated structures, etc. Counting on anything to be locked in place seems naive.

So you suspend with a kernel which holds your filesystem data/cache/inodes at
0x1234000 and restore with a kernel that expects to see your filesystem data at
0x1235000.
Ouch.

I would hope that the data used by the resumed kernel would be the same data that was suspended, not something from another kernel.

Personally I think the kernel suspend should write a signature - similar to a
hash of the bzImage - into the suspend image so it won't even attempt a resume
if there's a mismatch. (Yes, I made this mistake once whilst playing with suspend).

Someone else dropped a note saying the FC kernels use suspend2, and work fine. I'm off to look at the FC source and see if that's the case. That would explain why suspend works and resume doesn't; hopefully there's a 2.6.21 suspend2 patch in that case.
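
For what it's worth, the signature idea suggested above (hash the kernel image, store the hash in the suspend image, refuse to resume on a mismatch) can be sketched in a few lines. This is only a userspace illustration in Python, not how swsusp or suspend2 actually lay out their images. The function names, the choice of SHA-256, and the "signature first, snapshot after" layout are all assumptions made up for the example.

```python
import hashlib

# Hypothetical layout: [32-byte SHA-256 of the kernel image][memory snapshot].
SIG_LEN = hashlib.sha256().digest_size


def kernel_signature(kernel_image: bytes) -> bytes:
    """Hash of the kernel image (standing in for 'a hash of the bzImage')."""
    return hashlib.sha256(kernel_image).digest()


def write_suspend_image(kernel_image: bytes, memory_snapshot: bytes) -> bytes:
    """Prefix the saved memory with the signature of the suspending kernel."""
    return kernel_signature(kernel_image) + memory_snapshot


def try_resume(running_kernel_image: bytes, suspend_image: bytes):
    """Return the snapshot if the running kernel matches, else None.

    None means "do not even attempt a resume; fall back to a cold boot".
    """
    stored = suspend_image[:SIG_LEN]
    snapshot = suspend_image[SIG_LEN:]
    if stored != kernel_signature(running_kernel_image):
        return None
    return snapshot


if __name__ == "__main__":
    test_kernel = b"test kernel image"        # placeholder bytes, not a real bzImage
    default_kernel = b"default kernel image"  # placeholder bytes
    image = write_suspend_image(test_kernel, b"saved memory")
    print(try_resume(test_kernel, image) is not None)     # same kernel: resume allowed
    print(try_resume(default_kernel, image) is not None)  # other kernel: refused
```

In the scenario from this thread, the machine suspended under the test kernel and came back up under the default kernel, so a check like this would refuse the resume up front instead of trying to restore the image.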