Re: [PATCH v2 2/4] libnvdimm/namespace: Enforce memremap_compat_align()

From: Aneesh Kumar K.V
Date: Fri Feb 14 2020 - 11:55:41 EST


On 2/14/20 10:14 PM, Jeff Moyer wrote:
Dan Williams <dan.j.williams@xxxxxxxxx> writes:

On Thu, Feb 13, 2020 at 1:55 PM Jeff Moyer <jmoyer@xxxxxxxxxx> wrote:

Dan Williams <dan.j.williams@xxxxxxxxx> writes:

The pmem driver on PowerPC crashes with the following signature when
instantiating misaligned namespaces that map their capacity via
memremap_pages().

BUG: Unable to handle kernel data access at 0xc001000406000000
Faulting instruction address: 0xc000000000090790
NIP [c000000000090790] arch_add_memory+0xc0/0x130
LR [c000000000090744] arch_add_memory+0x74/0x130
Call Trace:
arch_add_memory+0x74/0x130 (unreliable)
memremap_pages+0x74c/0xa30
devm_memremap_pages+0x3c/0xa0
pmem_attach_disk+0x188/0x770
nvdimm_bus_probe+0xd8/0x470

With the assumption that only memremap_pages() has alignment
constraints, enforce memremap_compat_align() for
pmem_should_map_pages(), nd_pfn, or nd_dax cases.

Reported-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
Cc: Jeff Moyer <jmoyer@xxxxxxxxxx>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
Link: https://lore.kernel.org/r/158041477336.3889308.4581652885008605170.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
---
drivers/nvdimm/namespace_devs.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
index 032dc61725ff..aff1f32fdb4f 100644
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -1739,6 +1739,16 @@ struct nd_namespace_common *nvdimm_namespace_common_probe(struct device *dev)
return ERR_PTR(-ENODEV);
}

+ if (pmem_should_map_pages(dev) || nd_pfn || nd_dax) {
+ struct nd_namespace_io *nsio = to_nd_namespace_io(&ndns->dev);
+ resource_size_t start = nsio->res.start;
+
+ if (!IS_ALIGNED(start | size, memremap_compat_align())) {
+ dev_dbg(&ndns->dev, "misaligned, unable to map\n");
+ return ERR_PTR(-EOPNOTSUPP);
+ }
+ }
+
if (is_namespace_pmem(&ndns->dev)) {
struct nd_namespace_pmem *nspm;


Actually, I take back my ack. :) This prevents a previously working
namespace from being successfully probed/setup.

Do you have a test case handy? I can see a potential gap with a
namespace that used internal padding to fix up the alignment.

# ndctl list -v -n namespace0.0
[
{
"dev":"namespace0.0",
"mode":"fsdax",
"map":"dev",
"size":52846133248,
"uuid":"b99f6f6a-2909-4189-9bfa-6eeebd95d40e",
"raw_uuid":"aff43777-015b-493f-bbf9-7c7b0fe33519",
"sector_size":512,
"align":4096,
"blockdev":"pmem0",
"numa_node":0
}
]

# cat /sys/bus/nd/devices/region0/mappings
6

# grep namespace0.0 /proc/iomem
1860000000-24e0003fff : namespace0.0

The goal of this check is to catch cases that are just going to fail
devm_memremap_pages(), and the expectation is that it could not have
worked before unless it was ported from another platform, or someone
flipped the page-size switch on PowerPC.

On x86, creation and probing of the namespace worked fine before this
patch. What *doesn't* work is creating another fsdax namespace after
this one. sector mode namespaces can still be created, though:

[
{
"dev":"namespace0.1",
"mode":"sector",
"size":53270768640,
"uuid":"67ea2c74-d4b1-4fc9-9c1a-a7d2a6c2a4a7",
"sector_size":512,
"blockdev":"pmem0.1s"
},

# grep namespace0.1 /proc/iomem
24e0004000-3160007fff : namespace0.1

I thought we were only going to enforce the alignment for a newly
created namespace? This should only check whether the alignment
works for the current platform.

The model is a new default 16MB alignment is enforced at creation
time, but if you need to support previously created namespaces then
you can manually trim that alignment requirement to no less than
memremap_compat_align() because that's the point at which
devm_memremap_pages() will start failing or crashing.

The problem is that older kernels did not enforce alignment to
SUBSECTION_SIZE. We shouldn't prevent those namespaces from being
accessed. The probe itself will not cause the WARN_ON to trigger.
Creating new namespaces at misaligned addresses could, but you've
altered the free space allocation such that we won't hit that anymore.

If I drop this patch, the probe will still work, and allocating new
namespaces will also work:

# ndctl list
[
{
"dev":"namespace0.1",
"mode":"sector",
"size":53270768640,
"uuid":"67ea2c74-d4b1-4fc9-9c1a-a7d2a6c2a4a7",
"sector_size":512,
"blockdev":"pmem0.1s"
},
{
"dev":"namespace0.0",
"mode":"fsdax",
"map":"dev",
"size":52846133248,
"uuid":"b99f6f6a-2909-4189-9bfa-6eeebd95d40e",
"sector_size":512,
"align":4096,
"blockdev":"pmem0"
}
]
ndctl create-namespace -m fsdax -s 36g -r 0
{
"dev":"namespace0.2",
"mode":"fsdax",
"map":"dev",
"size":"35.44 GiB (38.05 GB)",
"uuid":"7893264c-c7ef-4cbe-95e1-ccf2aff041fb",
"sector_size":512,
"align":2097152,
"blockdev":"pmem0.2"
}

proc/iomem:

1860000000-d55fffffff : Persistent Memory
1860000000-24e0003fff : namespace0.0
24e0004000-3160007fff : namespace0.1
3162000000-3a61ffffff : namespace0.2

So, maybe the right thing is to make memremap_compat_align return
PAGE_SIZE for x86 instead of SUBSECTION_SIZE?



I did that as part of https://lore.kernel.org/linux-nvdimm/20200120140749.69549-2-aneesh.kumar@xxxxxxxxxxxxx and applied the subsection details only when creating new namespace

https://lore.kernel.org/linux-nvdimm/20200120140749.69549-4-aneesh.kumar@xxxxxxxxxxxxx


But I do agree with the approach that in-order to create a compatible namespace we need enforce max possible align value across all supported architectures.


On POWER we should still be able to enforce SUBSECTION_SIZE restrictions. We did put that as document w.r.t. distributions like Suse https://www.suse.com/support/kb/doc/?id=7024300



-aneesh