[PATCH 00/15] get_user_pages() for dax mappings

From: Dan Williams
Date: Wed Sep 23 2015 - 00:47:09 EST


To date, we have implemented two I/O usage models for persistent memory,
PMEM (a persistent "ram disk") and DAX (mmap persistent memory into
userspace). This series adds a third, DAX-GUP, that allows DAX mappings
to be the target of direct-i/o. It allows userspace to coordinate
DMA/RDMA from/to persitent memory.

The implementation leverages the ZONE_DEVICE mm-zone that went into
4.3-rc1 to flag pages that are owned and dynamically mapped by a device
driver. The pmem driver, after mapping a persistent memory range into
the system memmap via devm_memremap_pages(), arranges for DAX to
distinguish pfn-only versus page-backed pmem-pfns via flags in the new
__pfn_t type. The DAX code, upon seeing a PFN_DEV+PFN_MAP flagged pfn,
flags the resulting pte(s) inserted into the process page tables with a
new _PAGE_DEVMAP flag. Later, when get_user_pages() is walking ptes it
keys off _PAGE_DEVMAP to pin the device hosting the page range active.
Finally, get_page() and put_page() are modified to take references
against the device driver established page mapping.

Next step, more testing specifically DAX-get_user_pages() vs truncate.

Patches 1 - 3 are general compilation fixups from 0day-kbuild reports
while developing this series.

Patches 4 - 7 are minor cleanups and reworks of the devm_memremap_* api.

Patches 8 - 10 add a reference counter for pinning the pmem driver
active while it is in use. It turns out, prior to these changes, you
can reliably crash the kernel on shutdown if the pmem device is unbound
while hosting a mounted filesystem.

Patches 11 - 15 use __pfn_t and the _PAGE_DEVMAP flag to implement the
dax-gup path.

This series is built on 4.3-rc2 plus the __dax_pmd_fault fix from Ross:
https://patchwork.kernel.org/patch/7244961/

---

Dan Williams (15):
avr32: convert to asm-generic/memory_model.h
hugetlb: fix compile error on tile
frv: fix compiler warning from definition of __pmd()
x86, mm: quiet arch_add_memory()
pmem: kill memremap_pmem()
devm_memunmap: use devres_release()
devm_memremap: convert to return ERR_PTR
block, dax, pmem: reference counting infrastructure
block, pmem: fix null pointer de-reference on shutdown, check for queue death
block, dax: fix lifetime of in-kernel dax mappings
mm, dax, pmem: introduce __pfn_t
mm, dax, gpu: convert vm_insert_mixed to __pfn_t, introduce _PAGE_DEVMAP
mm, dax: convert vmf_insert_pfn_pmd() to __pfn_t
mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gup
mm, x86: get_user_pages() for dax mappings


arch/alpha/include/asm/pgtable.h | 1
arch/avr32/include/asm/page.h | 8 +
arch/frv/include/asm/page.h | 2
arch/ia64/include/asm/pgtable.h | 1
arch/m68k/include/asm/page_no.h | 1
arch/parisc/include/asm/pgtable.h | 1
arch/powerpc/include/asm/pgtable.h | 1
arch/powerpc/sysdev/axonram.c | 10 +
arch/sparc/include/asm/pgtable_64.h | 2
arch/tile/include/asm/pgtable.h | 1
arch/um/include/asm/pgtable-3level.h | 1
arch/x86/include/asm/pgtable.h | 24 ++++
arch/x86/include/asm/pgtable_types.h | 7 +
arch/x86/mm/gup.c | 56 ++++++++
arch/x86/mm/init.c | 4 -
arch/x86/mm/init_64.c | 4 -
arch/x86/mm/pat.c | 4 -
block/blk-core.c | 86 ++++++++++++-
block/blk-mq-sysfs.c | 2
block/blk-mq.c | 48 ++-----
block/blk-sysfs.c | 9 +
block/blk.h | 3
drivers/block/brd.c | 6 -
drivers/gpu/drm/exynos/exynos_drm_gem.c | 3
drivers/gpu/drm/gma500/framebuffer.c | 2
drivers/gpu/drm/msm/msm_gem.c | 3
drivers/gpu/drm/omapdrm/omap_gem.c | 6 +
drivers/gpu/drm/ttm/ttm_bo_vm.c | 3
drivers/nvdimm/pmem.c | 57 +++++---
drivers/s390/block/dcssblk.c | 12 +-
fs/block_dev.c | 2
fs/dax.c | 140 +++++++++++++-------
include/asm-generic/pgtable.h | 6 +
include/linux/blkdev.h | 24 +++-
include/linux/huge_mm.h | 2
include/linux/hugetlb.h | 1
include/linux/io.h | 17 --
include/linux/mm.h | 212 +++++++++++++++++++++++++++++--
include/linux/mm_types.h | 6 +
include/linux/pfn.h | 9 +
include/linux/pmem.h | 26 ----
kernel/memremap.c | 78 +++++++++++
mm/gup.c | 11 +-
mm/huge_memory.c | 10 +
mm/hugetlb.c | 18 ++-
mm/memory.c | 17 +-
mm/swap.c | 15 ++
47 files changed, 729 insertions(+), 233 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/