RE: [PATCH v6 1/2] dma-contiguous: provide the ability to reserve per-numa CMA
From: Song Bao Hua (Barry Song)
Date: Fri Aug 21 2020 - 04:29:45 EST
> -----Original Message-----
> From: linux-kernel-owner@xxxxxxxxxxxxxxx
> [mailto:linux-kernel-owner@xxxxxxxxxxxxxxx] On Behalf Of Randy Dunlap
> Sent: Friday, August 21, 2020 2:50 PM
> To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>; hch@xxxxxx;
> m.szyprowski@xxxxxxxxxxx; robin.murphy@xxxxxxx; will@xxxxxxxxxx;
> ganapatrao.kulkarni@xxxxxxxxxx; catalin.marinas@xxxxxxx
> Cc: iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>;
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> huangdaode <huangdaode@xxxxxxxxxx>; Jonathan Cameron
> <jonathan.cameron@xxxxxxxxxx>; Nicolas Saenz Julienne
> <nsaenzjulienne@xxxxxxx>; Steve Capper <steve.capper@xxxxxxx>; Andrew
> Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; Mike Rapoport <rppt@xxxxxxxxxxxxx>
> Subject: Re: [PATCH v6 1/2] dma-contiguous: provide the ability to reserve
> per-numa CMA
>
> On 8/20/20 7:26 PM, Barry Song wrote:
> >
> >
> > Cc: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > Cc: Christoph Hellwig <hch@xxxxxx>
> > Cc: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
> > Cc: Will Deacon <will@xxxxxxxxxx>
> > Cc: Robin Murphy <robin.murphy@xxxxxxx>
> > Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@xxxxxxxxxx>
> > Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> > Cc: Nicolas Saenz Julienne <nsaenzjulienne@xxxxxxx>
> > Cc: Steve Capper <steve.capper@xxxxxxx>
> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > Cc: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> > Signed-off-by: Barry Song <song.bao.hua@xxxxxxxxxxxxx>
> > ---
> > v6: rebase on top of 5.9-rc1;
> > doc cleanup
> >
> > .../admin-guide/kernel-parameters.txt | 9 ++
> > include/linux/dma-contiguous.h | 6 ++
> > kernel/dma/Kconfig | 10 ++
> > kernel/dma/contiguous.c | 100
> ++++++++++++++++--
> > 4 files changed, 115 insertions(+), 10 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt
> b/Documentation/admin-guide/kernel-parameters.txt
> > index bdc1f33fd3d1..3f33b89aeab5 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -599,6 +599,15 @@
> > altogether. For more information, see
> > include/linux/dma-contiguous.h
> >
> > + pernuma_cma=nn[MG]
>
> memparse() allows any one of these suffixes: K, M, G, T, P, E
> and nothing in the option parsing function cares what suffix is used...
Hello Randy,
Thanks for your comments.
Actually, I am following the suffix convention documented for the default cma parameter:
	cma=nn[MG]@[start[MG][-end[MG]]]
			[ARM,X86,KNL]
			Sets the size of kernel global memory area for
			contiguous memory allocations and optionally the
			placement constraint by the physical address range of
			memory allocations. A value of 0 disables CMA
			altogether. For more information, see
			include/linux/dma-contiguous.h
I suggest users set the size in either MB or GB, as they do for cma.
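
For reference, the suffix handling Randy describes can be sketched in userspace C. This is a simplified stand-in, not the kernel's memparse(): the name memparse_sketch is hypothetical, and the sketch only mirrors the behaviour of parsing a number and scaling it by an optional K, M, G, T, P or E suffix.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Userspace sketch of memparse()-style parsing (hypothetical name,
 * not the kernel implementation). Parses a number with strtoull()
 * and scales it by an optional K/M/G/T/P/E suffix (powers of 1024).
 * If end is non-NULL, it is set to the first unparsed character.
 */
unsigned long long memparse_sketch(const char *s, char **end)
{
	char *p;
	unsigned long long val;
	unsigned int shift = 0;

	assert(s != NULL);
	val = strtoull(s, &p, 0);

	switch (*p) {
	case 'E': case 'e': shift = 60; break;
	case 'P': case 'p': shift = 50; break;
	case 'T': case 't': shift = 40; break;
	case 'G': case 'g': shift = 30; break;
	case 'M': case 'm': shift = 20; break;
	case 'K': case 'k': shift = 10; break;
	default: break;
	}

	if (shift) {
		val <<= shift;
		p++;	/* consume the suffix character */
	}
	if (end)
		*end = p;
	return val;
}
```

With this sketch, "64M" and "2G" parse to 64 MiB and 2 GiB in bytes, and "512K" is accepted just as well, which is the point of the review comment: the parser does not restrict the suffix to M or G.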
>
> > + [ARM64,KNL]
> > + Sets the size of kernel per-numa memory area for
> > + contiguous memory allocations. A value of 0 disables
> > + per-numa CMA altogether. DMA users on node nid will
> > + first try to allocate buffer from the pernuma area
> > + which is located in node nid, if the allocation fails,
> > + they will fallback to the global default memory area.
> > +
> > cmo_free_hint= [PPC] Format: { yes | no }
> > Specify whether pages are marked as being inactive
> > when they are freed. This is used in CMO environments
>
> > diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> > index cff7e60968b9..89b95f10e56d 100644
> > --- a/kernel/dma/contiguous.c
> > +++ b/kernel/dma/contiguous.c
> > @@ -69,6 +69,19 @@ static int __init early_cma(char *p)
> > }
> > early_param("cma", early_cma);
> >
> > +#ifdef CONFIG_DMA_PERNUMA_CMA
> > +
> > +static struct cma *dma_contiguous_pernuma_area[MAX_NUMNODES];
> > +static phys_addr_t pernuma_size_bytes __initdata;
>
> why phys_addr_t? couldn't it just be unsigned long long?
>
Mainly to follow the existing convention in kernel/dma/contiguous.c.
I think the original code probably meant that the size should not be larger than the
maximum value of phys_addr_t:
/*
 * Default global CMA area size can be defined in kernel's .config.
 * This is useful mainly for distro maintainers to create a kernel
 * that works correctly for most supported systems.
 * The size can be set in bytes or as a percentage of the total memory
 * in the system.
 *
 * Users, who want to set the size of global CMA area for their system
 * should use cma= kernel parameter.
 */
static const phys_addr_t size_bytes __initconst =
	(phys_addr_t)CMA_SIZE_MBYTES * SZ_1M;
static phys_addr_t size_cmdline __initdata = -1;
static phys_addr_t base_cmdline __initdata;
static phys_addr_t limit_cmdline __initdata;

void __init dma_contiguous_reserve(phys_addr_t limit)
{
	phys_addr_t selected_size = 0;
	phys_addr_t selected_base = 0;
	phys_addr_t selected_limit = limit;
	bool fixed = false;

	pr_debug("%s(limit %08lx)\n", __func__, (unsigned long)limit);

	if (size_cmdline != -1) {
		selected_size = size_cmdline;
		selected_base = base_cmdline;
		selected_limit = min_not_zero(limit_cmdline, limit);
		if (base_cmdline + size_cmdline == limit_cmdline)
			fixed = true;
If the whole file uses phys_addr_t for sizes, I don't want the new code to look inconsistent.
> OK, so cma_declare_contiguous_nid() uses phys_addr_t. Fine.
>
> > +
> > +static int __init early_pernuma_cma(char *p)
> > +{
> > + pernuma_size_bytes = memparse(p, &p);
> > + return 0;
> > +}
> > +early_param("pernuma_cma", early_pernuma_cma);
> > +#endif
> > +
> > #ifdef CONFIG_CMA_SIZE_PERCENTAGE
> >
> > static phys_addr_t __init __maybe_unused
> cma_early_percent_memory(void)
> > @@ -96,6 +109,34 @@ static inline __maybe_unused phys_addr_t
> cma_early_percent_memory(void)
> >
> > #endif
> >
> > +#ifdef CONFIG_DMA_PERNUMA_CMA
> > +void __init dma_pernuma_cma_reserve(void)
> > +{
> > + int nid;
> > +
> > + if (!pernuma_size_bytes)
> > + return;
> > +
> > + for_each_node_state(nid, N_ONLINE) {
> > + int ret;
> > + char name[20];
> > + struct cma **cma = &dma_contiguous_pernuma_area[nid];
> > +
> > + snprintf(name, sizeof(name), "pernuma%d", nid);
> > + ret = cma_declare_contiguous_nid(0, pernuma_size_bytes, 0, 0,
> > + 0, false, name, cma, nid);
> > + if (ret) {
> > + pr_warn("%s: reservation failed: err %d, node %d", __func__,
> > + ret, nid);
> > + continue;
> > + }
> > +
> > + pr_debug("%s: reserved %llu MiB on node %d\n", __func__,
> > + (unsigned long long)pernuma_size_bytes / SZ_1M, nid);
>
> Conversely, if you want to leave pernuma_size_bytes as phys_addr_t,
> you should use %pa (or %pap) to print it.
Here I think it is used as an integer size rather than an address, so printing it with %llu after the cast seems fine.
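
To illustrate the cast-and-print approach used in the pr_debug() call above, here is a minimal userspace sketch. The typedef is an assumption standing in for the kernel type (the real phys_addr_t is 32 or 64 bits wide depending on the configuration), and size_in_mib and format_reserved are hypothetical helper names. The %pa alternative Randy mentions takes the value by reference and is specific to the kernel's printk, so it is not shown here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: userspace stand-in for the kernel's phys_addr_t. */
typedef uint64_t phys_addr_t;

#define SZ_1M 0x00100000ULL

/* Convert a byte count to whole MiB, widening to a fixed type first. */
unsigned long long size_in_mib(phys_addr_t size_bytes)
{
	return (unsigned long long)size_bytes / SZ_1M;
}

/*
 * Format the message into buf. The cast makes %llu correct
 * regardless of how wide phys_addr_t is on a given build.
 * Returns the snprintf() result.
 */
int format_reserved(char *buf, size_t len, phys_addr_t size_bytes, int nid)
{
	assert(buf != NULL);
	return snprintf(buf, len, "reserved %llu MiB on node %d",
			size_in_mib(size_bytes), nid);
}
```

The cast is what makes the format string portable: without it, %llu would be wrong on builds where phys_addr_t is only 32 bits wide.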
>
> > + }
> > +}
> > +#endif
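
The allocation order described in the pernuma_cma documentation text (try the area on node nid first, then fall back to the global default area) can be sketched in plain C as follows. Everything here is hypothetical and simplified: struct area stands in for struct cma and only counts free pages, MAX_NODES stands in for MAX_NUMNODES, and the function names are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4	/* hypothetical; stands in for MAX_NUMNODES */

/* Hypothetical stand-in for struct cma: tracks only free pages. */
struct area {
	size_t free_pages;
};

struct area *pernuma_area[MAX_NODES];	/* NULL if the node has no area */
struct area global_area;

/* Take count pages from one area; returns 1 on success, 0 on failure. */
int area_alloc(struct area *a, size_t count)
{
	if (!a || a->free_pages < count)
		return 0;
	a->free_pages -= count;
	return 1;
}

/*
 * Try the per-node area of nid first, then fall back to the
 * global area. Returns the area that satisfied the request,
 * or NULL if both attempts failed.
 */
struct area *alloc_with_fallback(int nid, size_t count)
{
	assert(nid >= 0 && nid < MAX_NODES);

	if (area_alloc(pernuma_area[nid], count))
		return pernuma_area[nid];
	if (area_alloc(&global_area, count))
		return &global_area;
	return NULL;
}
```

A node whose reservation failed simply has a NULL entry in this sketch, so requests from that node go straight to the global area.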
Thanks
Barry