Re: [PATCH 2/2] base/memory, hotplug: fix a kernel oops in show_valid_zones()

From: Kani, Toshimitsu
Date: Fri Jan 27 2017 - 13:24:45 EST


On Fri, 2017-01-27 at 08:48 +0100, gregkh@xxxxxxxxxxxxxxxxxxx wrote:
> On Thu, Jan 26, 2017 at 10:26:23PM +0000, Kani, Toshimitsu wrote:
> > On Thu, 2017-01-26 at 13:52 -0800, Andrew Morton wrote:
> > > On Thu, 26 Jan 2017 14:44:15 -0700 Toshi Kani <toshi.kani@xxxxxxx> wrote:
> > >
> > > > Reading a sysfs memoryN/valid_zones file leads to the following
> > > > oops when the first page of a range is not backed by struct
> > > > page. show_valid_zones() assumes that 'start_pfn' is always
> > > > valid for page_zone().
> > > >
> > > >  BUG: unable to handle kernel paging request at ffffea017a000000
> > > >  IP: show_valid_zones+0x6f/0x160
> > > >
> > > > Since test_pages_in_a_zone() already checks holes, extend this
> > > > function to return 'valid_start' and 'valid_end' for a given
> > > > range. show_valid_zones() then proceeds with the valid range.
> > >
> > > This doesn't apply to current mainline due to changes in
> > > zone_can_shift().  Please redo and resend.
> >
> > Sorry, I will rebase to the -mm tree and resend the patches.
> >
> > > Please also update the changelog to provide sufficient
> > > information for others to decide which kernel(s) need the
> > > fix.  In particular: under what circumstances will it occur?  On
> > > real machines which real people own?
> >
> > Yes, this issue happens on real x86 machines with 64GiB or more
> > memory.  On such systems, the memory block size is bumped up to
> > 2GiB. [1]
> >
> > Here is an example system.  0x3240000000 is only aligned by 1GiB
> > and its memory block starts from 0x3200000000, which is not backed
> > by struct page.
> >
> >  BIOS-e820: [mem 0x0000003240000000-0x000000603fffffff] usable
> >
> > I will add the descriptions to the patch.
>
> Should it also be backported to the stable kernels to resolve the
> issue there?

Yes, it should be backported to the stable kernels. The memory block
size change was made by commit bdee237c034, which was merged in 3.9.
However, this patch-set depends on (and fixes) the change to
test_pages_in_a_zone() made by commit 5f0f2887f4, which was merged in
4.4. So, in its current form, I'd recommend we backport it as far back
as 4.4.

Thanks,
-Toshi