RE: [External] Re: [PATCH 0/3] libnvdimm: reset seeds for next namespace creation

From: Ocean HY1 He
Date: Tue Sep 11 2018 - 04:50:07 EST




> -----Original Message-----
> From: Dan Williams <dan.j.williams@xxxxxxxxx>
> Sent: Tuesday, September 11, 2018 8:51 AM
> To: Ocean He <oceanhehy@xxxxxxxxx>
> Cc: zwisler@xxxxxxxxxx; Vishal L Verma <vishal.l.verma@xxxxxxxxx>; Dave Jiang
> <dave.jiang@xxxxxxxxx>; linux-nvdimm <linux-nvdimm@xxxxxxxxxxxx>; Linux
> Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>; Ocean HY1 He
> <hehy1@xxxxxxxxxx>
> Subject: [External] Re: [PATCH 0/3] libnvdimm: reset seeds for next
> namespace creation
>
> On Sun, Sep 9, 2018 at 11:21 PM, Ocean He <oceanhehy@xxxxxxxxx> wrote:
> > From: Ocean He <hehy1@xxxxxxxxxx>
> >
> > When pmem namespaces created are smaller than section size twice, the
> > second creation would fail and meanwhile there is a kernel call trace
> > which comes from commit 15d36fecd0bdc7510b70 ("mm: disallow mappings
> > that conflict for devm_memremap_pages()").
> > ------------[ cut here ]------------
> > nd_pmem pfn1.1: Conflicting mapping in same section
> > WARNING: CPU: 84 PID: 51974 at kernel/memremap.c:194 devm_memremap_pages+0x4a0/0x4e0
> > CPU: 84 PID: 51974 Comm: ndctl Kdump: loaded Tainted: G W E 4.19.0-rc2-23-default+ #27
> > RIP: 0010:devm_memremap_pages+0x4a0/0x4e0
> > Call Trace:
> > pmem_attach_disk+0x3ab/0x581 [nd_pmem]
> > nvdimm_bus_probe+0x69/0x150 [libnvdimm]
> > really_probe+0x262/0x3d0
> > driver_probe_device+0x60/0x120
> > bind_store+0x102/0x190
> > kernfs_fop_write+0x105/0x180
> > __vfs_write+0x36/0x1a0
> > ? common_file_perm+0x47/0x130
> > ? security_file_permission+0x2c/0xb0
> > vfs_write+0xad/0x1a0
> > ksys_write+0x52/0xc0
> > do_syscall_64+0x5b/0x180
> > entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > Here is an example (section size is 128MB) based on kernel 4.19-rc2.
> > # ndctl create-namespace -r region1 -s 100m -t pmem -m fsdax
> > {
> > "dev":"namespace1.0",
> > "mode":"fsdax",
> > "map":"dev",
> > "size":"96.00 MiB (100.66 MB)",
> > "uuid":"ef9a0556-a610-40b5-8c71-43991765a2cc",
> > "raw_uuid":"177b22e2-b7e8-482f-a063-2b8de876d979",
> > "sector_size":512,
> > "blockdev":"pmem1",
> > "numa_node":1
> > }
> > # ndctl create-namespace -r region1 -s 100m -t pmem -m fsdax
> > libndctl: ndctl_pfn_enable: pfn1.1: failed to enable
> > Error: namespace1.1: failed to enable
> > failed to create namespace: No such device or address
> >
> > When above second creation failure occurs, the expectation is to destroy
> > namespace1.0 to create a new namespace which size is aligned with section
> > size. However, both namespace seed and pfn seed have been consumed, the
> > new namespace creation still fails.
> > # ndctl destroy-namespace namespace1.0 -f
> > destroyed 1 namespace
> > # ndctl create-namespace -r region1 -s 128m -t pmem -m fsdax
> > failed to create namespace: Device or resource busy
> >
> > To ensure pfn_seed/dax_seed and namespace_seed are always ready for next
> > namespace creation, this patch set enables seed detach and reset. Back to
> > the example, the new namespace creation never fails if this patch set
> > applied.
> > # ndctl destroy-namespace namespace1.0 -f
> > destroyed 1 namespace
> > # ndctl create-namespace -r region1 -s 128m -t pmem -m fsdax
> > {
> > "dev":"namespace1.0",
> > "mode":"fsdax",
> > "map":"dev",
> > "size":"124.00 MiB (130.02 MB)",
> > "uuid":"0d0e7506-d108-4a88-824a-edef26fd0399",
> > "raw_uuid":"efeb9647-12f5-44cd-8a52-2f3a0d14589a",
> > "sector_size":512,
> > "blockdev":"pmem1",
> > "numa_node":1
> > }
> > # ndctl create-namespace -r region1 -s 128m -t pmem -m fsdax
> > {
> > "dev":"namespace1.1",
> > "mode":"fsdax",
> > "map":"dev",
> > "size":130023424,
> > "uuid":"689828dc-8779-434d-8e93-0406d4e1e536",
> > "raw_uuid":"d86e1025-c224-48b6-b2a7-6ccef152d5fd",
> > "sector_size":512,
> > "blockdev":"pmem1.1",
> > "numa_node":1
> > }
> >
> > The mode devdax (-m devdax) has the same issue, this patch set could
> > cover it.
>
> This is good analysis, but I believe this is better fixed / handled in
> ndctl directly. This is just one of a few reasons that namespace
> creation can fail, and it should be ndctl's job to recover from failed
> creation. The kernel only provides the mechanism the policy of what to
> do with errors and interrupted namespace creation is up to userspace.
>
Thanks for your review. I have just sent out an ndctl patch for this
issue; please help review it. Many thanks!
https://lists.01.org/pipermail/linux-nvdimm/2018-September/017778.html

Ocean.
> Also, in the future, the plan is to allow namespaces smaller than a
> section size which will fix this particular failing condition
> properly.
I am interested in what minimum size will be allowed for namespace creation.
I need this to guide NVDIMM enablement on Lenovo ThinkSystem servers.
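
For reference, the section constraint behind the current failure can be
sketched in user space as below. This is only an illustration of the idea
in the warning ("Conflicting mapping in same section"), not the kernel's
actual check. The 128 MiB section size is taken from the example above;
the names sketch_section_of() and sketch_ranges_share_section() and the
start offsets passed to them are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 128 MiB memory sections, as in the example in this thread. */
#define SKETCH_SECTION_SIZE (128ULL << 20)

/* Index of the memory section that contains physical address addr. */
uint64_t sketch_section_of(uint64_t addr)
{
	return addr / SKETCH_SECTION_SIZE;
}

/*
 * Return true if the byte ranges [a_start, a_start + a_size) and
 * [b_start, b_start + b_size) touch at least one common section.
 */
bool sketch_ranges_share_section(uint64_t a_start, uint64_t a_size,
				 uint64_t b_start, uint64_t b_size)
{
	assert(a_size > 0 && b_size > 0);

	uint64_t a_first = sketch_section_of(a_start);
	uint64_t a_last = sketch_section_of(a_start + a_size - 1);
	uint64_t b_first = sketch_section_of(b_start);
	uint64_t b_last = sketch_section_of(b_start + b_size - 1);

	/* Two inclusive index intervals overlap iff each starts before the other ends. */
	return a_first <= b_last && b_first <= a_last;
}
```

With these assumptions, two back-to-back 100 MiB ranges starting at offset 0
share section 0, while two back-to-back 128 MiB ranges do not share any
section.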

I see that nvdimm_namespace_common_probe() returns an error if the size is
less than ND_MIN_NAMESPACE_SIZE (equal to PAGE_SIZE):

	size = nvdimm_namespace_capacity(ndns);
	if (size < ND_MIN_NAMESPACE_SIZE) {
		dev_dbg(&ndns->dev, "%pa, too small must be at least %#x\n",
				&size, ND_MIN_NAMESPACE_SIZE);
		return ERR_PTR(-ENODEV);
	}

I also see that nd_namespace_store() returns an error if the size is less
than SZ_16M:

	if (__nvdimm_namespace_capacity(ndns) < SZ_16M) {
		dev_dbg(dev, "%s too small to host\n", name);
		len = -ENXIO;
		goto out_attach;
	}
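
To make the question concrete, here is a minimal user-space sketch of how
the two checks quoted above combine. It is not kernel code. The 4 KiB page
size is an assumption (the snippets only say ND_MIN_NAMESPACE_SIZE equals
PAGE_SIZE), and the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, so ND_MIN_NAMESPACE_SIZE is modelled as 4096. */
#define SKETCH_PAGE_SIZE 4096ULL
/* Mirrors SZ_16M from the nd_namespace_store() snippet. */
#define SKETCH_SZ_16M (16ULL << 20)

/* The ordering of the two checks below relies on this relation. */
static_assert(SKETCH_PAGE_SIZE < SKETCH_SZ_16M,
	      "page-size minimum must be below the 16M minimum");

enum sketch_verdict {
	SKETCH_TOO_SMALL_TO_PROBE, /* below the PAGE_SIZE minimum (-ENODEV path) */
	SKETCH_TOO_SMALL_TO_HOST,  /* below SZ_16M (-ENXIO path) */
	SKETCH_OK,                 /* passes both quoted checks */
};

/* Classify a namespace capacity in bytes against both quoted limits. */
enum sketch_verdict sketch_check_namespace_size(uint64_t size)
{
	if (size < SKETCH_PAGE_SIZE)
		return SKETCH_TOO_SMALL_TO_PROBE;
	if (size < SKETCH_SZ_16M)
		return SKETCH_TOO_SMALL_TO_HOST;
	return SKETCH_OK;
}
```

Under these assumptions an 8 MiB namespace would pass the probe check but
fail the 16M check, which is why I would like to know which limit will be
the effective minimum.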

Ocean.