Re: [PATCH 7/7] libnvdimm/pfn: Fix 'start_pad' implementation

From: Dan Williams
Date: Fri Feb 22 2019 - 12:12:52 EST


On Fri, Feb 22, 2019 at 7:42 AM Jeff Moyer <jmoyer@xxxxxxxxxx> wrote:
>
> Dan Williams <dan.j.williams@xxxxxxxxx> writes:
>
> >> > However, to fix this situation a non-backwards compatible change
> >> > needs to be made to the interpretation of the nd_pfn info-block.
> >> > ->start_pad needs to be accounted in ->map.map_offset (formerly
> >> > ->data_offset), and ->map.map_base (formerly ->phys_addr) needs to be
> >> > adjusted to the section aligned resource base used to establish
> >> > ->map.map formerly (formerly ->virt_addr).
> >> >
> >> > The guiding principles of the info-block compatibility fixup is to
> >> > maintain the interpretation of ->data_offset for implementations like
> >> > the EFI driver that only care about data_access not dax, but cause older
> >> > Linux implementations that care about the mode and dax to fail to parse
> >> > the new info-block.
> >>
> >> What if the core mm grew support for hotplug on sub-section boundaries?
> >> Would't that fix this problem (and others)?
> >
> > Yes, I think it would, and I had patches along these lines [2]. Last
> > time I looked at this I was asked by core-mm folks to await some
> > general refactoring of hotplug [3], and I wasn't proud about some of
> > the hacks I used to make it work. In general I'm less confident about
> > being able to get sub-section-hotplug over the goal line (core-mm
> > resistance to hotplug complexity) vs the local hacks in nvdimm to deal
> > with this breakage.
>
> You first posted that patch series in December of 2016. How long do we
> wait for this refactoring to happen?
>
> Meanwhile, we've been kicking this can down the road for far too long.
> Simple namespace creation fails to work. For example:
>
> # ndctl create-namespace -m fsdax -s 128m
> Error: '--size=' must align to interleave-width: 6 and alignment: 2097152
> did you intend --size=132M?
>
> failed to create namespace: Invalid argument
>
> ok, I can't actually create a small, section-aligned namespace. Let's
> bump it up:
>
> # ndctl create-namespace -m fsdax -s 132m
> {
> "dev":"namespace1.0",
> "mode":"fsdax",
> "map":"dev",
> "size":"126.00 MiB (132.12 MB)",
> "uuid":"2a5f8fe0-69e2-46bf-98bc-0f5667cd810a",
> "raw_uuid":"f7324317-5cd2-491e-8cd1-ad03770593f2",
> "sector_size":512,
> "blockdev":"pmem1",
> "numa_node":1
> }
>
> Great! Now let's create another one.
>
> # ndctl create-namespace -m fsdax -s 132m
> libndctl: ndctl_pfn_enable: pfn1.1: failed to enable
> Error: namespace1.2: failed to enable
>
> failed to create namespace: No such device or address
>
> (along with a kernel warning spew)

I assume you're seeing this on the libnvdimm-pending branch?

> And at this point, all further ndctl create-namespace commands fail.
> Lovely. This is a wart that was acceptable only because a fix was
> coming. 2+ years later, and we're still adding hacks to work around it
> (and there have been *several* hacks).

True.

>
> > Local hacks are always a sad choice, but I think leaving these
> > configurations stranded for another kernel cycle is not tenable. It
> > wasn't until the github issue did I realize that the problem was
> > happening in the wild on NVDIMM-N platforms.
>
> I understand the desire for expediency. At some point, though, we have
> to address the root of the problem.

Well, you've defibrillated me back to reality. We've suffered the
incomplete broken hacks for 2 years, what's another 10 weeks? I'll
dust off the sub-section patches and take another run at it.