Re: [PATCH] memremap: Fix NULL pointer BUG in get_zone_device_page()

From: Kani, Toshimitsu
Date: Tue Aug 23 2016 - 20:03:39 EST


On Tue, 2016-08-23 at 15:32 -0700, Dan Williams wrote:
> On Tue, Aug 23, 2016 at 11:43 AM, Toshi Kani <toshi.kani@xxxxxxx>
> wrote:
 :
> I'm not sure about this fix.  The point of honoring
> vmem_altmap_offset() is because a portion of the resource that is
> passed to devm_memremap_pages() also contains the metadata info block
> for the device.  The offset says "use everything past this point for
> pages".  This may work for avoiding a crash, but it may corrupt info
> block metadata in the process.  Can you provide more information
> about the failing scenario to be sure that we are not triggering a
> fault on an address that is not meant to have a page mapping?  I.e.
> what is the host physical address of the page that caused this fault,
> and is it valid?

The fault address in question was the 2nd page of an NVDIMM range.  I
assumed this fault address was valid and needed to be handled.  Here is
some info about the base and patched cases.  Let me know if you need
more info.

Base
====

The following NVDIMM range was set to /dev/dax.

/proc/iomem
480000000-87fffffff : Persistent Memory

devm_memremap_pages() initialized struct page for pfns 0x490200-0x87ffff.
This left page->pgmap uninitialized for pfns 0x480000-0x4901ff.
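
As a cross-check of those pfn bounds, here is a minimal user-space
sketch (not kernel code).  It assumes 4 KiB base pages, i.e. a
PAGE_SHIFT of 12; the helper name phys_to_pfn() is only illustrative.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB base pages, as on x86-64


def phys_to_pfn(phys_addr):
    """Return the page frame number that contains a physical address."""
    return phys_addr >> PAGE_SHIFT


# Values taken from the /proc/iomem line and the trace in this mail.
res_start, res_end = 0x480000000, 0x87fffffff
first_initialized_pfn = 0x490200

first_pfn = phys_to_pfn(res_start)
last_pfn = phys_to_pfn(res_end)
skipped = first_initialized_pfn - first_pfn  # pfns left without a pgmap

print(f"resource pfns: {first_pfn:#x}-{last_pfn:#x}")
print(f"skipped pfns : {first_pfn:#x}-{first_initialized_pfn - 1:#x} "
      f"({skipped:#x} pages)")
```

The skipped span works out to 0x10200 pages starting at the first pfn
of the resource.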

 devm_memremap_pages: pgmap 0xffff88046d0453f0
 [0]  : pfn 0x490200, page ffffea0012408000, pgmap ffff88046d0453f0
 [1]  : pfn 0x490201, page ffffea0012408040, pgmap ffff88046d0453f0
 [2]  : pfn 0x490202, page ffffea0012408080, pgmap ffff88046d0453f0
 [3]  : pfn 0x490203, page ffffea00124080c0, pgmap ffff88046d0453f0
 [4]  : pfn 0x490204, page ffffea0012408100, pgmap ffff88046d0453f0
 :
 [E+1]: pfn 0x880000, page ffffea0021ffffc0, pgmap ffff88046d0453f0

The faulted page was pfn 0x480001, which was the 2nd page in the NVDIMM
range and did not have a valid pgmap.  This led to the BUG.

 pfn              0x480001
 page             0xffffea0012000040
 page->pgmap      0xffffea0012000060
 page->pgmap->ref (null)

Patch
=====

With the patch, devm_memremap_pages() initializes as follows.

 devm_memremap_pages: pgmap ffff880462b3b4b0
 [0]  : pfn 0x480000, page ffffea0012000000, pgmap ffff880462b3b4b0
 [1]  : pfn 0x480001, page ffffea0012000040, pgmap ffff880462b3b4b0
 [2]  : pfn 0x480002, page ffffea0012000080, pgmap ffff880462b3b4b0
 [3]  : pfn 0x480003, page ffffea00120000c0, pgmap ffff880462b3b4b0
 [4]  : pfn 0x480004, page ffffea0012000100, pgmap ffff880462b3b4b0
 :
 [E+1]: pfn 0x880000, page ffffea0021ffffc0, pgmap ffff880462b3b4b0

A page fault on pfn 0x480001 is now handled because it has a valid pgmap.

 pfn              0x480001
 page             0xffffea0012000040
 page->pgmap      0xffff880462b3b4b0
 page->pgmap->ref 0xffff880462b3b530

Its dev_pagemap and vmem_altmap are as follows.

crash> p {struct dev_pagemap} 0xffff880462b3b4b0
$2 = {
 altmap = 0xffff880462b3b4d0,
 res = 0xffff880462b3b468,
 ref = 0xffff880462b3b530,
 dev = 0xffff880463e37010
}

crash> p {struct vmem_altmap} 0xffff880462b3b4d0
$3 = {
 base_pfn = 0x480000,
 reserve = 0x2,
 free = 0x101fe,
 align = 0x1fe,
 alloc = 0x10000
}
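
These altmap values line up with the base case.  A minimal sketch,
assuming vmem_altmap_offset() evaluates to reserve + free; the
VmemAltmap class and altmap_offset() below are illustrative user-space
stand-ins, not kernel code.

```python
from dataclasses import dataclass


@dataclass
class VmemAltmap:
    """Illustrative mirror of the struct vmem_altmap fields printed above."""
    base_pfn: int
    reserve: int
    free: int
    align: int
    alloc: int


def altmap_offset(altmap):
    # Assumption: same arithmetic as vmem_altmap_offset(), i.e. the number
    # of pfns from base_pfn that are set aside before the data pages start.
    return altmap.reserve + altmap.free


altmap = VmemAltmap(base_pfn=0x480000, reserve=0x2, free=0x101fe,
                    align=0x1fe, alloc=0x10000)

offset = altmap_offset(altmap)
first_data_pfn = altmap.base_pfn + offset

print(f"offset         : {offset:#x}")
print(f"first data pfn : {first_data_pfn:#x}")
# In this dump free happens to equal align + alloc.
print(f"align + alloc  : {altmap.align + altmap.alloc:#x}")
```

With these numbers the first data pfn is 0x490200, which is where the
base case started initializing struct page.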

The struct page entry for this pfn is physically located at 0x480200040.

crash> vtop 0xffffea0012000040
VIRTUAL           PHYSICAL
ffffea0012000040  480200040

PML4 DIRECTORY: ffffffff81c06000
PAGE DIRECTORY: 47ffe6067
   PUD: 47ffe6000 => 47ffe5067
   PMD: 47ffe5480 => 80000004802001e3
  PAGE: 480200000  (2MB)

      PTE         PHYSICAL   FLAGS
80000004802001e3  480200000  (PRESENT|RW|ACCESSED|DIRTY|PSE|GLOBAL|NX)

      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffea0012008000 480200000                0        0  1 4fffe000000400
reserved
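
The addresses above can be reproduced with simple arithmetic.  This is
a rough sketch, assuming the vmemmap base is 0xffffea0000000000 (x86-64
without KASLR) and sizeof(struct page) is 0x40, which matches the 0x40
stride in the traces; pfn_to_page() and vtop_2mb() here are illustrative
user-space helpers, not the kernel or crash implementations.

```python
VMEMMAP_START = 0xffffea0000000000  # assumption: x86-64 vmemmap base, no KASLR
STRUCT_PAGE_SIZE = 0x40             # inferred from the 0x40 stride in the traces
PMD_SIZE = 2 << 20                  # 2 MiB mapping, per the "(2MB)" vtop line
PAGE_SHIFT = 12                     # assumption: 4 KiB base pages


def pfn_to_page(pfn):
    """Virtual address of the struct page entry for a pfn (vmemmap model)."""
    return VMEMMAP_START + pfn * STRUCT_PAGE_SIZE


def vtop_2mb(vaddr, pmd_phys_base):
    """Translate a virtual address that is covered by one 2 MiB mapping."""
    return pmd_phys_base + (vaddr & (PMD_SIZE - 1))


page = pfn_to_page(0x480001)
phys = vtop_2mb(page, 0x480200000)   # 2 MiB page base from the PMD entry
backing_pfn = phys >> PAGE_SHIFT

print(f"struct page : {page:#x}")
print(f"physical    : {phys:#x}")
print(f"backing pfn : {backing_pfn:#x}")
```

The backing pfn is 0x480200, which equals base_pfn + reserve + align
(0x480000 + 0x2 + 0x1fe) from the altmap dump.  That suggests the
struct page array itself sits in the altmap area at the start of the
device range.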

Thanks,
-Toshi