Re: [PATCH v7 1/7] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request

From: Joonsoo Kim
Date: Thu Apr 20 2017 - 21:33:15 EST


On Mon, Apr 17, 2017 at 04:38:08PM +0900, Minchan Kim wrote:
> Hi Joonsoo,
>
> On Tue, Apr 11, 2017 at 12:17:14PM +0900, js1304@xxxxxxxxx wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> >
> > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > important to reserve. When ZONE_MOVABLE is used, this problem would
> > theoretically decrease the usable memory for GFP_HIGHUSER_MOVABLE
> > allocation requests, which are mainly used for page cache and anon page
> > allocation. So, fix it.
> >
> > Also, sizing the sysctl_lowmem_reserve_ratio array as MAX_NR_ZONES - 1
> > makes the code complex. For example, on a highmem system, the following
> > reserve ratio is applied to *NORMAL ZONE*, which can easily mislead
> > people.
> >
> > #ifdef CONFIG_HIGHMEM
> > 32
> > #endif
> >
> > This patch also fixes this situation by sizing the
> > sysctl_lowmem_reserve_ratio array as MAX_NR_ZONES and placing the
> > "#ifdef" in the right place.
> >
> > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
> > Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> > ---
> > include/linux/mmzone.h | 2 +-
> > mm/page_alloc.c | 11 ++++++-----
> > 2 files changed, 7 insertions(+), 6 deletions(-)
> >
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index ebaccd4..96194bf 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -869,7 +869,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
> > void __user *, size_t *, loff_t *);
> > int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
> > void __user *, size_t *, loff_t *);
> > -extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
> > +extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
> > int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
> > void __user *, size_t *, loff_t *);
> > int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 32b31d6..60ffa4e 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -203,17 +203,18 @@ static void __free_pages_ok(struct page *page, unsigned int order);
> > * TBD: should special case ZONE_DMA32 machines here - in those we normally
> > * don't need any ZONE_NORMAL reservation
> > */
> > -int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
> > +int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
> > #ifdef CONFIG_ZONE_DMA
> > - 256,
> > + [ZONE_DMA] = 256,
> > #endif
> > #ifdef CONFIG_ZONE_DMA32
> > - 256,
> > + [ZONE_DMA32] = 256,
> > #endif
> > + [ZONE_NORMAL] = 32,
> > #ifdef CONFIG_HIGHMEM
> > - 32,
> > + [ZONE_HIGHMEM] = INT_MAX,
> > #endif
> > - 32,
> > + [ZONE_MOVABLE] = INT_MAX,
> > };
>
> We need to update lowmem_reserve_ratio in Documentation/sysctl/vm.txt.

Okay!

> And to me, INT_MAX is rather awkward.

I also think so.

> # cat /proc/sys/vm/lowmem_reserve_ratio
> 256 256 32 2147483647 2147483647
>
> What do you think about using 0 or -1 as a special value
> instead of 2147483647?

I thought about it but dropped the idea. setup_per_zone_lowmem_reserve()
adjusts the value to 1 if it is less than 1. Someone might be (ab)using
this adjustment, so it's safer to use INT_MAX.
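
To illustrate, here is a rough userspace sketch of that adjustment. It is
not the kernel code itself: reserve_pages() is a hypothetical helper made
up for this mail, and it assumes the reserve is computed as the upper
zone's managed page count divided by the ratio.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical userspace model of the per-zone reserve computation.
 * Assumption: reserve = managed_pages / ratio, with any ratio below 1
 * adjusted up to 1 first.
 */
static unsigned long reserve_pages(unsigned long managed_pages, int ratio)
{
	/* The adjustment in question: 0 and negative values become 1. */
	if (ratio < 1)
		ratio = 1;
	assert(ratio >= 1);

	return managed_pages / (unsigned long)ratio;
}
```

With this model, a ratio of 0 or -1 reserves the whole page count instead
of nothing, while INT_MAX yields a reserve of 0 for any page count below
INT_MAX.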

> Anyway, it could be a separate patch regardless of zone_cma,
> so I hope Andrew merges this patch regardless of the other patches
> in this patchset.

Okay. I will send an updated version soon.

Thanks.