Re: [PATCH v4 1/5] mm/memremap_pages: Introduce memremap_compat_align()
From: Michael Ellerman
Date: Wed Mar 11 2020 - 23:17:16 EST
Dan Williams <dan.j.williams@xxxxxxxxx> writes:
> The "sub-section memory hotplug" facility allows memremap_pages() users
> like libnvdimm to compensate for hardware platforms like x86 that have a
> section size larger than their hardware memory mapping granularity. The
> compensation that sub-section support affords is being tolerant of
> physical memory resources shifting by units smaller (64MiB on x86) than
> the memory-hotplug section size (128MiB). The platform
> physical-memory mapping granularity is limited by the number and
> capability of address-decode-registers in the memory controller.
>
> While the sub-section support allows memremap_pages() to operate on
> sub-section (2MiB) granularity, the Power architecture may still
> require 16MiB alignment on "!radix_enabled()" platforms.
>
> In order for libnvdimm to be able to detect and manage this per-arch
> limitation, introduce memremap_compat_align() as a common minimum
> alignment across all driver-facing memory-mapping interfaces, and let
> Power override it to 16MiB in the "!radix_enabled()" case.
>
> The assumption / requirement for 16MiB to be a viable
> memremap_compat_align() value is that Power does not have platforms
> where its equivalent of address-decode-registers ever hardware remaps a
> persistent memory resource on smaller than 16MiB boundaries. Note that I
> tried my best to not add a new Kconfig symbol, but header include
> entanglements defeated the #ifndef memremap_compat_align design pattern
> and the need to export it defeats the __weak design pattern for arch
> overrides.
>
> Based on an initial patch by Aneesh.
>
> Link: http://lore.kernel.org/r/CAPcyv4gBGNP95APYaBcsocEa50tQj9b5h__83vgngjq3ouGX_Q@xxxxxxxxxxxxxx
> Reported-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
> Reported-by: Jeff Moyer <jmoyer@xxxxxxxxxx>
> Cc: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
> Cc: Paul Mackerras <paulus@xxxxxxxxx>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
> Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> ---
> arch/powerpc/Kconfig | 1 +
> arch/powerpc/mm/ioremap.c | 21 +++++++++++++++++++++
> drivers/nvdimm/pfn_devs.c | 2 +-
> include/linux/memremap.h | 8 ++++++++
> include/linux/mmzone.h | 1 +
> lib/Kconfig | 3 +++
> mm/memremap.c | 23 +++++++++++++++++++++++
> 7 files changed, 58 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 497b7d0b2d7e..e6ffe905e2b9 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -122,6 +122,7 @@ config PPC
> select ARCH_HAS_GCOV_PROFILE_ALL
> select ARCH_HAS_KCOV
> select ARCH_HAS_HUGEPD if HUGETLB_PAGE
> + select ARCH_HAS_MEMREMAP_COMPAT_ALIGN
> select ARCH_HAS_MMIOWB if PPC64
> select ARCH_HAS_PHYS_TO_DMA
> select ARCH_HAS_PMEM_API
> diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
> index fc669643ce6a..b1a0aebe8c48 100644
> --- a/arch/powerpc/mm/ioremap.c
> +++ b/arch/powerpc/mm/ioremap.c
> @@ -2,6 +2,7 @@
>
> #include <linux/io.h>
> #include <linux/slab.h>
> +#include <linux/mmzone.h>
> #include <linux/vmalloc.h>
> #include <asm/io-workarounds.h>
>
> @@ -97,3 +98,23 @@ void __iomem *do_ioremap(phys_addr_t pa, phys_addr_t offset, unsigned long size,
>
> return NULL;
> }
> +
> +#ifdef CONFIG_ZONE_DEVICE
> +/*
> + * Override the generic version in mm/memremap.c.
> + *
> + * With hash translation, the direct-map range is mapped with just one
> + * page size selected by htab_init_page_sizes(). Consult
> + * mmu_psize_defs[] to determine the minimum page size alignment.
> + */
> +unsigned long memremap_compat_align(void)
> +{
> +	unsigned int shift = mmu_psize_defs[mmu_linear_psize].shift;
> +
> +	if (radix_enabled())
> +		return SUBSECTION_SIZE;
> +	return max(SUBSECTION_SIZE, 1UL << shift);
> +
> +}
> +EXPORT_SYMBOL_GPL(memremap_compat_align);
> +#endif
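
For anyone following along, the selection logic in the hunk above can be modelled in plain userspace C. This is only a rough sketch, not the kernel code: the sketch_* names and the SKETCH_SUBSECTION_* constants are made up for illustration, the 2MiB sub-section size is taken from the commit message, and the linear-map page shift is passed in as a parameter rather than read from mmu_psize_defs[]. The second helper is an assumed example of how a caller might test a range against the returned alignment; it is not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 2MiB sub-sections, per the commit message. */
#define SKETCH_SUBSECTION_SHIFT 21
#define SKETCH_SUBSECTION_SIZE  (1UL << SKETCH_SUBSECTION_SHIFT)

/*
 * Hypothetical stand-in for the powerpc memremap_compat_align().
 * 'radix' models radix_enabled(); 'linear_shift' models
 * mmu_psize_defs[mmu_linear_psize].shift.
 */
static unsigned long sketch_compat_align(bool radix, unsigned int linear_shift)
{
	unsigned long linear_size;

	/* Guard against an undefined shift in this userspace model. */
	assert(linear_shift < 8 * sizeof(unsigned long));

	if (radix)
		return SKETCH_SUBSECTION_SIZE;

	/* Hash: never go below the sub-section size. */
	linear_size = 1UL << linear_shift;
	return linear_size > SKETCH_SUBSECTION_SIZE ?
		linear_size : SKETCH_SUBSECTION_SIZE;
}

/*
 * Hypothetical caller-side check: true if both the start and the size
 * of a range are multiples of 'align' (which must be a power of two).
 */
static bool sketch_range_aligned(uint64_t start, uint64_t size,
				 unsigned long align)
{
	return ((start | size) & ((uint64_t)align - 1)) == 0;
}
```

With a 16MiB linear-map page size (shift 24) the hash case comes out at 16MiB and the radix case at 2MiB, which matches the commit message. Note the function takes the max with the sub-section size, so the result is only 16MiB when the linear mapping actually uses 16MiB pages.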
LGTM.
Acked-by: Michael Ellerman <mpe@xxxxxxxxxxxxxx> (powerpc)
cheers