RE: [PATCH] memremap: Fix NULL pointer BUG in get_zone_device_page()

From: Kani, Toshimitsu
Date: Tue Aug 23 2016 - 21:38:32 EST


> On Tue, Aug 23, 2016 at 4:47 PM, Kani, Toshimitsu <toshi.kani@xxxxxxx>
> wrote:
> > On Tue, 2016-08-23 at 15:32 -0700, Dan Williams wrote:
> >> On Tue, Aug 23, 2016 at 11:43 AM, Toshi Kani <toshi.kani@xxxxxxx>
> >> wrote:
> > :
> >> I'm not sure about this fix. The point of honoring
> >> vmem_altmap_offset() is because a portion of the resource that is
> >> passed to devm_memremap_pages() also contains the metadata info block
> >> for the device. The offset says "use everything past this point for
> >> pages". This may work for avoiding a crash, but it may corrupt info
> >> block metadata in the process. Can you provide more information
> >> about the failing scenario to be sure that we are not triggering a
> >> fault on an address that is not meant to have a page mapping? I.e.
> >> what is the host physical address of the page that caused this fault,
> >> and is it valid?
> >
> > The fault address in question was the 2nd page of an NVDIMM range. I
> > assumed this fault address was valid and needed to be handled. Here is
> > some info about the base and patched cases. Let me know if you need
> > more info.
> >
> > Base
> > ====
> >
> > The following NVDIMM range was set to /dev/dax.
> >
> > /proc/iomem
> > 480000000-87fffffff : Persistent Memory
> >
> > devm_memremap_pages() initialized struct page from 0x490200-0x87ffff.
>
> This seems like the start of the trouble. What happened to the first
> 1GB of the address range? I'm assuming the 'align' attribute is set
> to 2MB because we start at a pfn offset of 0x200, but this should be
> starting at 0x480200.

pfn_first() adds an offset from vmem_altmap_offset(), which is
altmap->reserve + altmap->free. You can see why it ended up
with this offset from the dumps in my previous email.
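
As a rough illustration of that arithmetic, here is a minimal userspace
sketch. It is not the kernel source: struct altmap_sketch, altmap_offset()
and first_pfn() are simplified stand-ins I made up for this mail, 4 KiB
pages are assumed, and the reserve/free split below is hypothetical. Only
the sum (0x10200 pfns, the gap between 0x480000 and 0x490200) comes from
the numbers quoted above.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical, reduced stand-in for the kernel's struct vmem_altmap. */
struct altmap_sketch {
	unsigned long reserve;	/* pfns set aside at the start of the range */
	unsigned long free;	/* pfns available for the memmap itself */
};

/* Number of pfns to skip from the base before pages are considered valid. */
static unsigned long altmap_offset(const struct altmap_sketch *altmap)
{
	return altmap->reserve + altmap->free;
}

/* First pfn that gets a struct page: resource start plus the altmap offset. */
static unsigned long first_pfn(unsigned long long res_start,
			       const struct altmap_sketch *altmap)
{
	unsigned long pfn = (unsigned long)(res_start >> PAGE_SHIFT);

	if (altmap)
		pfn += altmap_offset(altmap);
	return pfn;
}

/* Usage example; the 0x200/0x10000 split is illustrative only. */
void first_pfn_example(void)
{
	struct altmap_sketch altmap = { .reserve = 0x200, .free = 0x10000 };

	/* Range starts at physical address 0x480000000, i.e. pfn 0x480000. */
	assert(first_pfn(0x480000000ULL, &altmap) == 0x490200UL);
}
```

With no altmap at all, the same helper would simply return 0x480000, the
first pfn of the range.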

Thanks
-Toshi