Re: [PATCH v7 0/7] Introduce ZONE_CMA

From: Michal Hocko
Date: Thu May 04 2017 - 08:47:06 EST


On Thu 04-05-17 14:33:24, Vlastimil Babka wrote:
> On 05/02/2017 03:03 PM, Michal Hocko wrote:
> > On Tue 02-05-17 10:06:01, Vlastimil Babka wrote:
> >> On 04/27/2017 05:06 PM, Michal Hocko wrote:
> >>> On Tue 25-04-17 12:42:57, Joonsoo Kim wrote:
> >>>> On Mon, Apr 24, 2017 at 03:09:36PM +0200, Michal Hocko wrote:
> >>>>> On Mon 17-04-17 11:02:12, Joonsoo Kim wrote:
> >>>>>> On Thu, Apr 13, 2017 at 01:56:15PM +0200, Michal Hocko wrote:
> >>>>>>> On Wed 12-04-17 10:35:06, Joonsoo Kim wrote:
> >>> [...]
> >>>>> not for free. For most common configurations where we have ZONE_DMA,
> >>>>> ZONE_DMA32, ZONE_NORMAL and ZONE_MOVABLE all the 3 bits are already
> >>>>> consumed so a new zone will need a new one AFAICS.
> >>>>
> >>>> Yes, it requires one more bit for a new zone and it's handled by the patch.
> >>>
> >>> I am pretty sure that you are aware that consuming new page flag bits
> >>> is usually a no-go and something we try to avoid as much as possible
> >>> because we are in a great shortage there. So there really has to be a
> >>> _strong_ reason if we go that way. My current understanding is that the
> >>> whole zone concept is more about a more convenient implementation rather
> >>> than a fundamental change which will solve unsolvable problems with the
> >>> current approach. More on that below.
> >>
> >> I don't see it as such a big issue. It's behind a CONFIG option (so we
> >> also don't need the jump labels you suggest later) and enabling it
> >> reduces the number of possible NUMA nodes (not page flags). So either
> >> you are building a kernel for android phone that needs CMA but will have
> >> a single NUMA node, or for a large server with many nodes that won't
> >> have CMA. As long as there won't be large servers that need CMA, we
> >> should be fine (yes, I know some HW vendors can be very creative, but
> >> then it's their problem?).
> >
> > Is this really about Android/UMA systems only? My quick grep seems to disagree
> > $ git grep CONFIG_CMA=y
> > arch/arm/configs/exynos_defconfig:CONFIG_CMA=y
> > arch/arm/configs/imx_v6_v7_defconfig:CONFIG_CMA=y
> > arch/arm/configs/keystone_defconfig:CONFIG_CMA=y
> > arch/arm/configs/multi_v7_defconfig:CONFIG_CMA=y
> > arch/arm/configs/omap2plus_defconfig:CONFIG_CMA=y
> > arch/arm/configs/tegra_defconfig:CONFIG_CMA=y
> > arch/arm/configs/vexpress_defconfig:CONFIG_CMA=y
> > arch/arm64/configs/defconfig:CONFIG_CMA=y
> > arch/mips/configs/ci20_defconfig:CONFIG_CMA=y
> > arch/mips/configs/db1xxx_defconfig:CONFIG_CMA=y
> > arch/s390/configs/default_defconfig:CONFIG_CMA=y
> > arch/s390/configs/gcov_defconfig:CONFIG_CMA=y
> > arch/s390/configs/performance_defconfig:CONFIG_CMA=y
> > arch/s390/defconfig:CONFIG_CMA=y
> >
> > I am pretty sure s390 and ppc support NUMA and aim at supporting really
> > large systems.
>
> I don't see ppc there,

config KVM_BOOK3S_64_HV
	tristate "KVM for POWER7 and later using hypervisor mode in host"
	depends on KVM_BOOK3S_64 && PPC_POWERNV
	select KVM_BOOK3S_HV_POSSIBLE
	select MMU_NOTIFIER
	select CMA

Commit fa61a4e376d21 tries to explain some more.

[...]
> > Are we really ready to add another thing like that? How are distribution
> > kernels going to handle that?
>
> I still hope that generic enterprise/desktop distributions can disable
> it, and it's only used for small devices with custom kernels.
>
> The config burden is already there in any case, it just translates to
> extra migratetype and fastpath hooks, not extra zone and potentially
> less nodes.

AFAIU the extra migratetype costs nothing when there are no CMA
reservations, and those hooks can be made a no-op behind a static branch
as well. So distribution kernels do not really have to be afraid of
enabling CMA currently.
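
To illustrate the static branch idea, here is a rough userspace sketch,
not kernel code. All names in it are made up for illustration. A plain
bool stands in for what would be a static key in the kernel
(DEFINE_STATIC_KEY_FALSE plus static_branch_unlikely), where the
disabled case is a patched-out jump rather than a load and compare.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace stand-in for a kernel static key (hypothetical name).
 * In the kernel the check below would be a jump label that is
 * patched at runtime, so the disabled case costs close to nothing.
 */
static bool cma_reserved_key = false;

/* Hypothetical counter, only here to observe whether the hook ran. */
static size_t cma_hook_calls = 0;

/* Would be called when the first CMA area gets reserved (assumption). */
static void cma_area_reserved(void)
{
	cma_reserved_key = true;
}

/* Placeholder for CMA specific work in the allocator fast path. */
static void cma_fastpath_hook(void)
{
	cma_hook_calls++;
}

/*
 * Sketch of an allocation fast path: the CMA hook is skipped
 * entirely as long as nothing has reserved a CMA area.
 */
static int alloc_fastpath(void)
{
	if (cma_reserved_key)
		cma_fastpath_hook();
	return 0; /* pretend the allocation succeeded */
}
```

With no reservation, alloc_fastpath() never reaches the hook; after
cma_area_reserved() every call goes through it. That is the behavior
a distribution kernel with CONFIG_CMA=y but no CMA users would rely on.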

--
Michal Hocko
SUSE Labs