Re: [PATCH v3] x86/e820: Fix handling of subpage regions when calculating nosave ranges
From: Ingo Molnar
Date: Mon Apr 07 2025 - 13:42:38 EST

* Myrrh Periwinkle <myrrhperiwinkle@xxxxxxxxxxx> wrote:
> On 4/7/25 01:36, Ingo Molnar wrote:
>
> > * Ingo Molnar<mingo@xxxxxxxxxx> wrote:
> >
> > > * Myrrh Periwinkle<myrrhperiwinkle@xxxxxxxxxxx> wrote:
> > >
> > > > The current implementation of e820__register_nosave_regions suffers from
> > > > multiple serious issues:
> > > > - The end of last region is tracked by PFN, causing it to find holes
> > > > that aren't there if two consecutive subpage regions are present
> > > > - The nosave PFN ranges derived from holes are rounded out (instead of
> > > > rounded in) which makes it inconsistent with how explicitly reserved
> > > > regions are handled
> > > >
> > > > Fix this by:
> > > > - Treating reserved regions as if they were holes, to ensure consistent
> > > > handling (rounding out nosave PFN ranges is more correct as the
> > > > kernel does not use partial pages)
> > > > - Tracking the end of the last RAM region by address instead of pages
> > > > to detect holes more precisely
> > > >
> > > > Cc:stable@xxxxxxxxxxxxxxx
> > > > Fixes: e5540f875404 ("x86/boot/e820: Consolidate 'struct e820_entry *entry' local variable names")
> > > So why is this SHA1 indicated as the root cause? AFAICS that commit
> > > does nothing but cleanups, so it cannot cause such regressions.
> > BTW.:
> >
> > A) "It was the first random commit that seemed related, sry"
> > B) "It's a 15 years old bug, but I wanted to indicate a fresh, 8-year old bug to get this into -stable. Busted!"
>
> You got me :) How did you know that this is a 15 years old bug?

Call it a 'regression radar' that every kernel maintainer develops
after their first 20 years or so - each bug has a distinct feel to
it, and this one felt genuinely *ancient*.

> [...] (although I didn't think the age of the bug a patch fixes would
> affect its chances of getting to -stable)

Yeah, it doesn't really affect its -stable eligibility much once we move
outside the ~6-12 month window that upstream recognizes as a
semi-recent regression - it was mostly my lame attempt at deadpan
humor, trying to play off 15-year-old bugs against 8-year-old bugs as
if 8-year-old bugs were fresh. Yeah, I know, it's not funny even to me
anymore, I'm weird that way. ;-)

> This specific revision was picked since it's the latest one that this
> patch can be straightforwardly applied to (there is a (trivial) merge
> conflict with -stable, though).

Yeah. So in the x86/urgent commit I've tagged the other commit you
pinpointed in your followup mail:

  Fixes: e8eff5ac294e ("[PATCH] Make swsusp avoid memory holes and reserved memory regions on x86_64")

Just to give backporters *some* chance at fixing this ancient bug
in older kernels, if they really want to.

Thanks,

Ingo
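
The first issue in the quoted changelog (tracking the end of the last
region by PFN rather than by address) can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the
kernel code and not the patch itself: it assumes 4 KiB pages, considers
RAM regions only, and the function names holes_tracking_pfn() and
holes_tracking_addr() are hypothetical. pfn_up() and pfn_down() are
modeled on the kernel's PFN_UP() and PFN_DOWN() macros.

```python
PAGE_SHIFT = 12              # assumption: 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT


def pfn_down(addr):
    """Page frame number containing addr (rounds down)."""
    return addr >> PAGE_SHIFT


def pfn_up(addr):
    """First page frame number at or above addr (rounds up)."""
    return (addr + PAGE_SIZE - 1) >> PAGE_SHIFT


def holes_tracking_pfn(ram_regions):
    """Find gaps while remembering the end of the last region as a PFN.

    ram_regions is a sorted list of (addr, size) tuples.
    Returns a list of (start_pfn, end_pfn) ranges, end exclusive.
    """
    holes = []
    last_pfn = 0
    for addr, size in ram_regions:
        if last_pfn < pfn_up(addr):
            holes.append((last_pfn, pfn_up(addr)))
        # Rounding here discards the sub-page part of the region end.
        last_pfn = pfn_down(addr + size)
    return holes


def holes_tracking_addr(ram_regions):
    """Find gaps while remembering the end of the last region as an address.

    A gap is only reported when the addresses really are discontiguous;
    the reported PFN range is then rounded out to whole pages.
    """
    holes = []
    last_addr = 0
    for addr, size in ram_regions:
        if last_addr < addr:
            holes.append((pfn_down(last_addr), pfn_up(addr)))
        last_addr = addr + size
    return holes


# Two consecutive sub-page regions that together cover page 0 completely.
regions = [(0x0000, 0x800), (0x0800, 0x800)]
print(holes_tracking_pfn(regions))   # [(0, 1)]  <- a hole that isn't there
print(holes_tracking_addr(regions))  # []
```

With PFN tracking, the end of the first region (0x800) is rounded down to
PFN 0 and the start of the second (also 0x800) is rounded up to PFN 1, so
the two roundings manufacture a gap. Comparing raw addresses avoids this,
and rounding is applied only when a real gap has been found.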