RE: [PATCH v6 1/2] dma-contiguous: provide the ability to reserve per-numa CMA

From: Song Bao Hua (Barry Song)
Date: Fri Aug 21 2020 - 05:44:17 EST




> -----Original Message-----
> From: Will Deacon [mailto:will@xxxxxxxxxx]
> Sent: Friday, August 21, 2020 9:27 PM
> To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>
> Cc: hch@xxxxxx; m.szyprowski@xxxxxxxxxxx; robin.murphy@xxxxxxx;
> ganapatrao.kulkarni@xxxxxxxxxx; catalin.marinas@xxxxxxx;
> iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>;
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> huangdaode <huangdaode@xxxxxxxxxx>; Jonathan Cameron
> <jonathan.cameron@xxxxxxxxxx>; Nicolas Saenz Julienne
> <nsaenzjulienne@xxxxxxx>; Steve Capper <steve.capper@xxxxxxx>; Andrew
> Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; Mike Rapoport <rppt@xxxxxxxxxxxxx>
> Subject: Re: [PATCH v6 1/2] dma-contiguous: provide the ability to reserve
> per-numa CMA
>
> On Fri, Aug 21, 2020 at 09:13:39AM +0000, Song Bao Hua (Barry Song) wrote:
> >
> >
> > > -----Original Message-----
> > > From: Will Deacon [mailto:will@xxxxxxxxxx]
> > > Sent: Friday, August 21, 2020 8:47 PM
> > > To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>
> > > Cc: hch@xxxxxx; m.szyprowski@xxxxxxxxxxx; robin.murphy@xxxxxxx;
> > > ganapatrao.kulkarni@xxxxxxxxxx; catalin.marinas@xxxxxxx;
> > > iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>;
> > > linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > huangdaode <huangdaode@xxxxxxxxxx>; Jonathan Cameron
> > > <jonathan.cameron@xxxxxxxxxx>; Nicolas Saenz Julienne
> > > <nsaenzjulienne@xxxxxxx>; Steve Capper <steve.capper@xxxxxxx>;
> > > Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; Mike Rapoport
> > > <rppt@xxxxxxxxxxxxx>
> > > Subject: Re: [PATCH v6 1/2] dma-contiguous: provide the ability to
> > > reserve per-numa CMA
> > >
> > > On Fri, Aug 21, 2020 at 02:26:14PM +1200, Barry Song wrote:
> > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > > > index bdc1f33fd3d1..3f33b89aeab5 100644
> > > > --- a/Documentation/admin-guide/kernel-parameters.txt
> > > > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > > > @@ -599,6 +599,15 @@
> > > > altogether. For more information, see
> > > > include/linux/dma-contiguous.h
> > > >
> > > > + pernuma_cma=nn[MG]
> > > > + [ARM64,KNL]
> > > > + Sets the size of kernel per-numa memory area for
> > > > + contiguous memory allocations. A value of 0 disables
> > > > + per-numa CMA altogether. DMA users on node nid will
> > > > + first try to allocate buffer from the pernuma area
> > > > + which is located in node nid, if the allocation fails,
> > > > + they will fallback to the global default memory area.
> > >
> > > What is the default behaviour if this option is not specified? Seems
> > > like that should be mentioned here.
>
> Just wanted to make sure you didn't miss this ^^

If it is not specified, the size defaults to 0, which means pernuma_cma is disabled.

I will add a sentence about this to the documentation.
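
To illustrate the default, here is a minimal userspace sketch, not the kernel code. parse_size() is a hypothetical stand-in for memparse() that handles only the K/M/G suffixes, and pernuma_cma_enabled() is a made-up helper that mirrors the "if (!pernuma_size_bytes) return;" check in the patch:

```c
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for the kernel's memparse(): parse a
 * number followed by an optional K/M/G suffix and return the size in bytes.
 */
static unsigned long long parse_size(const char *s)
{
	char *end;
	unsigned long long v = strtoull(s, &end, 0);

	switch (*end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Hypothetical helper: arg is the value given to "pernuma_cma=", or NULL
 * when the option is absent from the command line.
 * Returns 1 if per-numa areas would be reserved, 0 if the feature is off.
 */
static int pernuma_cma_enabled(const char *arg)
{
	unsigned long long size = 0;	/* default when the option is absent */

	if (arg)
		size = parse_size(arg);
	return size != 0;
}
```

With this sketch, both an absent option (NULL) and "pernuma_cma=0" leave the feature disabled, while "pernuma_cma=16M" enables it with a size of 16 MiB.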

>
> > >
> > > > diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
> > > > index 847a9d1fa634..db7a37ed35eb 100644
> > > > --- a/kernel/dma/Kconfig
> > > > +++ b/kernel/dma/Kconfig
> > > > @@ -118,6 +118,16 @@ config DMA_CMA
> > > > If unsure, say "n".
> > > >
> > > > if DMA_CMA
> > > > +
> > > > +config DMA_PERNUMA_CMA
> > > > + bool "Enable separate DMA Contiguous Memory Area for each NUMA Node"
> > >
> > > I don't understand the need for this config option. If you have
> > > DMA_CMA and you have NUMA, why wouldn't you want this enabled?
> >
> > Christoph preferred this in the previous patchset so that all of this
> > code can be dropped from the kernel text if users don't use pernuma CMA.
>
> Ok, I defer to Christoph here, but maybe a "default NUMA" might work?

Maybe "default NUMA && ARM64"?
I believe it would also benefit x86, but I don't have x86 server hardware or a
real scenario to test with, so I haven't added a dma_pernuma_cma_reserve() call
in arch/x86.
Hopefully someone working on x86 will pick it up and remove the "&& ARM64".

>
> > > > + help
> > > > + Enable this option to get pernuma CMA areas so that devices like
> > > > + ARM64 SMMU can get local memory by DMA coherent APIs.
> > > > +
> > > > + You can set the size of pernuma CMA by specifying "pernuma_cma=size"
> > > > + on the kernel's command line.
> > > > +
> > > > comment "Default contiguous memory area size:"
> > > >
> > > > config CMA_SIZE_MBYTES
> > > > diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> > > > index cff7e60968b9..89b95f10e56d 100644
> > > > --- a/kernel/dma/contiguous.c
> > > > +++ b/kernel/dma/contiguous.c
> > > > @@ -69,6 +69,19 @@ static int __init early_cma(char *p)
> > > > }
> > > > early_param("cma", early_cma);
> > > >
> > > > +#ifdef CONFIG_DMA_PERNUMA_CMA
> > > > +
> > > > +static struct cma *dma_contiguous_pernuma_area[MAX_NUMNODES];
> > > > +static phys_addr_t pernuma_size_bytes __initdata;
> > > > +
> > > > +static int __init early_pernuma_cma(char *p)
> > > > +{
> > > > + pernuma_size_bytes = memparse(p, &p);
> > > > + return 0;
> > > > +}
> > > > +early_param("pernuma_cma", early_pernuma_cma);
> > > > +#endif
> > > > +
> > > > #ifdef CONFIG_CMA_SIZE_PERCENTAGE
> > > >
> > > > static phys_addr_t __init __maybe_unused cma_early_percent_memory(void)
> > > > @@ -96,6 +109,34 @@ static inline __maybe_unused phys_addr_t cma_early_percent_memory(void)
> > > >
> > > > #endif
> > > >
> > > > +#ifdef CONFIG_DMA_PERNUMA_CMA
> > > > +void __init dma_pernuma_cma_reserve(void)
> > > > +{
> > > > + int nid;
> > > > +
> > > > + if (!pernuma_size_bytes)
> > > > + return;
> > >
> > > If this is useful (I assume it is), then I think we should have a
> > > non-zero default value, a bit like normal CMA does via CMA_SIZE_MBYTES.
> >
> > The patchset used to have a CONFIG_PERNUMA_CMA_SIZE in
> > kernel/dma/Kconfig, but Christoph was not comfortable with it:
> > https://lore.kernel.org/linux-iommu/20200728115231.GA793@xxxxxx/
> >
> > Would you mind hardcoding the value in CONFIG_CMDLINE in
> > arch/arm64/Kconfig, as Christoph mentioned:
> > config CMDLINE
> > default "pernuma_cma=16M"
> >
> > If you also don't like the change in arch/arm64/Kconfig CMDLINE, I
> > guess I have to depend on users' setting in cmdline just like hugetlb_cma.
>
> Again, I defer to Christoph for this code, so leave it like it is.
> However, the same argument applies to CMA_SIZE_MBYTES afaict, and I'm
> mainly looking for consistency.
>
> > > > + for_each_node_state(nid, N_ONLINE) {
> > >
> > > for_each_online_node() {
> > >
> > > > + int ret;
> > > > + char name[20];
> > >
> > > 20?
> > >
> > > Ah, wait, this is copy-pasta from hugetlb_cma_reserve(). Can you
> > > factor out the common parts at all?
> >
> > Actually I have a "#define CMA_MAX_NAME 64" in this commit:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=18e98e56f440
> >
> > The 20 in hugetlb_cma_reserve() was also written by me. If you are not
> > comfortable with it, I can switch to CMA_MAX_NAME. Do you think it really
> > matters here? 20 seems to be long enough for this scenario.
>
> Using CMA_MAX_NAME seems sensible to me, although I'm still a bit wary
> about the code duplication between this and the hugetlb code.

If a CMA area name has no index, the caller doesn't have to maintain a local
name array; it can simply pass a const string.
hugetlb_cma and pernuma_cma both happen to need sprintf() to build a local
name with an index, but this kind of scenario should be rare.
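
For reference, a minimal userspace sketch of the indexed-name pattern both callers share. cma_indexed_name() is a hypothetical helper, not an existing kernel function, and the "pernuma" prefix in the usage note is only an assumed example; CMA_MAX_NAME is the value quoted above. It uses snprintf() rather than sprintf() so the write is bounded and truncation can be detected:

```c
#include <stdio.h>

#define CMA_MAX_NAME 64	/* value quoted earlier in this thread */

/*
 * Hypothetical helper: build an indexed area name such as "pernuma0".
 * Returns 0 on success, -1 if the name was truncated or formatting failed.
 */
static int cma_indexed_name(char *buf, size_t len, const char *prefix, int nid)
{
	int n = snprintf(buf, len, "%s%d", prefix, nid);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would declare "char name[CMA_MAX_NAME];" and call cma_indexed_name(name, sizeof(name), "pernuma", nid) once per node, whereas a caller with a fixed name needs neither the buffer nor the helper.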

> Will

Thanks
Barry