Re: [PATCH v2 0/6] mm/devm_memremap_pages: Fix page release race

From: Dan Williams
Date: Tue May 14 2019 - 15:06:34 EST


On Tue, May 14, 2019 at 11:53 AM Jane Chu <jane.chu@xxxxxxxxxx> wrote:
>
> On 5/13/2019 12:22 PM, Logan Gunthorpe wrote:
>
> On 2019-05-08 11:05 a.m., Logan Gunthorpe wrote:
>
> On 2019-05-07 5:55 p.m., Dan Williams wrote:
>
> Changes since v1 [1]:
> - Fix a NULL-pointer deref crash in pci_p2pdma_release() (Logan)
>
> - Refresh the p2pdma patch headers to match the format of other p2pdma
> patches (Bjorn)
>
> - Collect Ira's reviewed-by
>
> [1]: https://lore.kernel.org/lkml/155387324370.2443841.574715745262628837.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> This series looks good to me:
>
> Reviewed-by: Logan Gunthorpe <logang@xxxxxxxxxxxx>
>
> However, I haven't tested it yet but I intend to later this week.
>
> I've tested libnvdimm-pending which includes this series on my setup and
> everything works great.
>
> Just wondering in a difference scenario where pmem pages are exported to
> a KVM guest, and then by mistake the user issues "ndctl destroy-namespace -f",
> will the kernel wait indefinitely until the user figures out to kill the guest
> and release the pmem pages?

It depends on whether the pages are pinned. Typically DAX memory
mappings assigned to a guest are not pinned in the host and can be
invalidated at any time. The pinning only occurs with VFIO and
device-assignment which isn't the common case, especially since that
configuration is blocked by fsdax. However, with devdax, yes you can
arrange for the system to go into an indefinite wait.

This somewhat ties back to the get_user_pages() vs DAX debate. The
indefinite stall issue with device-assignment could be addressed with
a requirement to hold a lease and expect that a lease revocation event
may escalate to SIGKILL in response to 'ndctl destroy-namespace'. The
expectation with device-dax is that it is already a raw interface with
pointy edges and caveats, but I would not be opposed to introducing a
lease semantic.