RE: [PATCH] memremap: Fix NULL pointer BUG in get_zone_device_page()
From: Kani, Toshimitsu
Date: Tue Aug 23 2016 - 23:06:01 EST
> On Tue, Aug 23, 2016 at 4:47 PM, Kani, Toshimitsu <toshi.kani@xxxxxxx>
> wrote:
> > On Tue, 2016-08-23 at 15:32 -0700, Dan Williams wrote:
> >> On Tue, Aug 23, 2016 at 11:43 AM, Toshi Kani <toshi.kani@xxxxxxx>
> >> wrote:
> > :
> >> I'm not sure about this fix. The point of honoring
> >> vmem_altmap_offset() is because a portion of the resource that is
> >> passed to devm_memremap_pages() also contains the metadata info
> >> block for the device. The offset says "use everything past this point for
> >> pages". This may work for avoiding a crash, but it may corrupt info
> >> block metadata in the process. Can you provide more information
> >> about the failing scenario to be sure that we are not triggering a
> >> fault on an address that is not meant to have a page mapping? I.e.
> >> what is the host physical address of the page that caused this fault,
> >> and is it valid?
> >
> > The fault address in question was the 2nd page of an NVDIMM range. I
> > assumed this fault address was valid and needed to be handled. Here is
> > some info about the base and patched cases. Let me know if you need
> > more info.
> >
> > Base
> > ====
> >
> > The following NVDIMM range was set to /dev/dax.
>
> With ndctl create-namespace or manually via sysfs? Specifically I'm
> looking for what the 'align' attribute was set to when the
> configuration was established. Can you provide a dump of the sysfs
> attributes for the /dev/dax parent device?

I used the ndctl command below.

  ndctl create-namespace -f -e namespace0.0 -m dax

Here is additional info from my notes for the base case.

crash> p {struct dev_pagemap} 0xffff88046d0453f0
$3 = {
  altmap = 0xffff88046d045410,
  res = 0xffff88046d0453a8,
  ref = 0xffff88046d0452f0,
  dev = 0xffff880464790410
}
crash> p {struct vmem_altmap} 0xffff88046d045410
$6 = {
  base_pfn = 0x480000,
  reserve = 0x2,      // PHYS_PFN(SZ_8K)
  free = 0x101fe,
  align = 0x1fe,
  alloc = 0x10000
}
crash> p {struct resource} 0xffff88046d0453a8
$4 = {
  start = 0x480000000,
  end = 0x87fffffff,
  name = 0xffff880c7da5d4a8 "region0",
  flags = 0x200,
  desc = 0x0,
  parent = 0x0,
  sibling = 0x0,
  child = 0x0
}
crash> p {struct percpu_ref} 0xffff88046d0452f0
$7 = {
  count = {
    counter = 0x8000000000000001
  },
  percpu_count_ptr = 0x60f380403a98,
  release = 0xffffffffa008a0a0,
  confirm_switch = 0x0,
  force_atomic = 0x0,
  rcu = {
    next = 0x0,
    func = 0x0
  }
}

Thanks,
-Toshi