RE: [PATCH][v6] PM / hibernate: Print the possible panic reason when resuming with inconsistent e820 map

From: Chen, Yu C
Date: Sun Aug 28 2016 - 09:34:19 EST



> -----Original Message-----
> From: Pavel Machek [mailto:pavel@xxxxxx]
> Sent: Sunday, August 28, 2016 9:15 PM
> To: Chen, Yu C
> Cc: joeyli; Rafael J. Wysocki; linux-pm@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH][v6] PM / hibernate: Print the possible panic reason when
> resuming with inconsistent e820 map
>
> On Sun 2016-08-28 13:08:02, Chen, Yu C wrote:
> > > -----Original Message-----
> > > From: Pavel Machek [mailto:pavel@xxxxxx]
> > > Sent: Sunday, August 28, 2016 8:48 PM
> > > To: Chen, Yu C
> > > Cc: joeyli; Rafael J. Wysocki; linux-pm@xxxxxxxxxxxxxxx; linux-
> > > kernel@xxxxxxxxxxxxxxx
> > > Subject: Re: [PATCH][v6] PM / hibernate: Print the possible panic
> > > reason when resuming with inconsistent e820 map
> > >
> > > On Sun 2016-08-28 10:07:10, Chen Yu wrote:
> > > > Hi,
> > > > On Fri, Aug 26, 2016 at 09:56:54PM +0200, Pavel Machek wrote:
> > > > > Hi!
> > > > >
> > > > > > > > > What's the progress of this patch? Looks already have
> > > > > > > > > experts
> > > review it.
> > > > > > > > > Why this patch didn't accept?
> > > > > > > > This patch is a little overkilled, and I have saved
> > > > > > > > another simpler version to only check the md5 hash (as
> > > > > > > > people suggested) for
> > > it. I can post it later.
> > > > > > > >
> > > > > > >
> > > > > > > I am happy to test and review it.
> > > > > > >
> > > > > > Here it is. As Rafael is on travel, it would be grateful if
> > > > > > you can give some advance on this, thanks!
> > > > >
> > > > > Better than last one.
> > > > >
> > > > > > + return -ENOMEM;
> > > > > > +
> > > > > > + req = ahash_request_alloc(tfm, GFP_ATOMIC);
> > > > >
> > > > > what context is this called from? GFP_ATOMIC allocations like to fail...
> > > > >
> > > > It is in normal process context, OK, I'll change it to GFP_KERNEL.
> > > > > > +static int hibernation_e820_check(void *buf) {
> > > > > > + int ret;
> > > > > > + char result[MD5_HASH_SIZE] = {0};
> > > > > > +
> > > > > > + ret = get_e820_md5(&e820_saved, result);
> > > > > > + if (ret)
> > > > > > + return ret;
> > > > > > +
> > > > > > + if (memcmp(result, buf, MD5_HASH_SIZE))
> > > > > > + e820_conflict = true;
> > > > >
> > > > > Passing return value using global variable is ugly. Can you just
> > > > > print the warning and kill the box here?
> > > > Do you mean get rid of the panic hooker and just print the warning
> > > > here?
> > >
> > > Yep, I'd do that... (And you probably want to rise the severity).
> > Sometime the users might not be able to see the warning during
> > resume(too fast) and got a system panic later, so I was thinking if it
> > might help user to see the warning in panic too, how about this:
> >
> > if (memcmp(result, buf, MD5_DIGEST_SIZE)) {
> > pr_err("PM: e820 map conflict detected!\n");
> > register_die_notifier(&hibernation_die_notifier);
> > }
> > So we can print warning in hibernation_die_notifier without introducing a
> global variable?
> >
>
> Actually, I'd kill the machine right away.
>
> if (memcmp(result, buf, MD5_DIGEST_SIZE)) {
> pr_err("PM: e820 map conflict detected!\n");
> panic("BIOS is playing funny tricks with us.\n");
> }
>
OK, this is more reasonable, will take this one.

Thanks,
Yu