Re: [PATCH][v11] PM / hibernate: Verify the consistent of e820 memory map by md5 digest

From: joeyli
Date: Fri Oct 07 2016 - 12:32:05 EST


Hi Chen Yu,

On Sun, Sep 25, 2016 at 12:17:57PM +0800, Chen Yu wrote:
> On some platforms, there is occasional panic triggered when trying to
> resume from hibernation, a typical panic looks like:
>
> "BUG: unable to handle kernel paging request at ffff880085894000
> IP: [<ffffffff810c5dc2>] load_image_lzo+0x8c2/0xe70"
>
> Investigation carried out by Lee Chun-Yi shows that this is because
> e820 map has been changed by BIOS across hibernation, and one
> of the page frames from suspend kernel is right located in restore
> kernel's unmapped region, so panic comes out when accessing unmapped
> kernel address.
>

Sorry, I can no longer get access to the machine that showed this issue.
So, for testing, I added a patch that fools the kernel into seeing a
changed e820 map on S4 resume.
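
For readers following along, the kind of check being exercised can be
sketched in userspace as below. This is only an illustration, not the test
patch and not the code from Chen Yu's patch: the demo_* names and the
simplified entry layout are hypothetical, and FNV-1a is used as a small
stand-in for md5 so that the example stays self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one e820 entry. */
struct demo_e820_entry {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
};

/* FNV-1a over a byte range; a stand-in for md5, NOT what the patch uses. */
static uint64_t demo_fnv1a(uint64_t h, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--) {
		h ^= *p++;
		h *= 0x100000001b3ULL;
	}
	return h;
}

/* Digest the map field by field so struct padding never enters the hash. */
static uint64_t demo_map_digest(const struct demo_e820_entry *map, size_t n)
{
	uint64_t h = 0xcbf29ce484222325ULL;
	size_t i;

	for (i = 0; i < n; i++) {
		h = demo_fnv1a(h, &map[i].addr, sizeof(map[i].addr));
		h = demo_fnv1a(h, &map[i].size, sizeof(map[i].size));
		h = demo_fnv1a(h, &map[i].type, sizeof(map[i].type));
	}
	return h;
}

/* Nonzero when the current map no longer matches the saved digest. */
static int demo_map_mismatch(uint64_t saved,
			     const struct demo_e820_entry *map, size_t n)
{
	return saved != demo_map_digest(map, n);
}
```

Computing the digest once at "suspend" time, changing one entry (for
example flipping a region type), and calling demo_map_mismatch() again
models what a faked e820 change does to the check.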

> In order to expose this issue earlier, the md5 hash of e820 map
> is passed from suspend kernel to restore kernel, and the restore
> kernel will terminate the resume process once it finds the md5
> hash are not the same.
>
[...snip]
> ---
> arch/x86/power/hibernate_64.c | 92 ++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 90 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
> index 9634557..d81b1af 100644
> --- a/arch/x86/power/hibernate_64.c
> +++ b/arch/x86/power/hibernate_64.c
> @@ -11,6 +11,10 @@
> #include <linux/gfp.h>
> #include <linux/smp.h>
> #include <linux/suspend.h>
> +#include <linux/scatterlist.h>
> +#include <linux/kdebug.h>

[...snip]

> @@ -216,5 +297,12 @@ int arch_hibernation_header_restore(void *addr)
> restore_jump_address = rdr->jump_address;
> jump_address_phys = rdr->jump_address_phys;
> restore_cr3 = rdr->cr3;
> - return (rdr->magic == RESTORE_MAGIC) ? 0 : -EINVAL;
> +
> + if (rdr->magic != RESTORE_MAGIC)
> + return -EINVAL;
> +
> + if (hibernation_e820_mismatch(rdr->e820_digest))
> + return -ENODEV;
> +
> + return 0;
> }
> --

Because check_image_kernel() does not look at which error code is
returned, the kernel only shows "PM: Image mismatch: architecture specific
data". That one message now covers two different failure reasons.
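
To illustrate the point, here is a small userspace model (not kernel
code). The demo_* names, the magic value and the e820_changed flag are
made up for the example; only the shape of the logic follows my reading
of check_image_kernel(), where any nonzero return is turned into the same
string.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_MAGIC 0x1234UL	/* hypothetical value, not RESTORE_MAGIC */

/* Hypothetical header: just enough state to trigger both failures. */
struct demo_header {
	unsigned long magic;
	int e820_changed;	/* stands in for the digest comparison */
};

/* Models the restore hook: two distinct error codes. */
static int demo_header_restore(const struct demo_header *h)
{
	if (h->magic != DEMO_MAGIC)
		return -EINVAL;
	if (h->e820_changed)
		return -ENODEV;
	return 0;
}

/* Models the caller: the error code is dropped, one string for both. */
static const char *demo_check_image(const struct demo_header *h)
{
	return demo_header_restore(h) ? "architecture specific data" : NULL;
}
```

With a header that has a wrong magic and another one with e820_changed
set, demo_header_restore() returns -EINVAL and -ENODEV respectively, but
demo_check_image() gives the same string for both, so the log cannot tell
them apart.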

I suggest printing a log message for each case, as the restore function
in the ARM64 architecture does. Something like this, please feel free to
modify the wording:

Index: linux/arch/x86/power/hibernate_64.c
===================================================================
--- linux.orig/arch/x86/power/hibernate_64.c
+++ linux/arch/x86/power/hibernate_64.c
@@ -298,11 +298,16 @@ int arch_hibernation_header_restore(void
 	jump_address_phys = rdr->jump_address_phys;
 	restore_cr3 = rdr->cr3;

-	if (rdr->magic != RESTORE_MAGIC)
+
+	if (rdr->magic != RESTORE_MAGIC) {
+		pr_crit("Hibernate image not generated by this kernel!\n");
 		return -EINVAL;
+	}

-	if (hibernation_e820_mismatch(rdr->e820_digest))
+	if (hibernation_e820_mismatch(rdr->e820_digest)) {
+		pr_crit("The e820 saved regions changed!\n");
 		return -ENODEV;
+	}

 	return 0;
 }

The other parts of your patch look good to me.


Thanks a lot!
Joey Lee