Re: [PATCH 3/4] memcg: explaing memcg's gfp_mask behavior inexplicit way.

From: KAMEZAWA Hiroyuki
Date: Mon Dec 01 2008 - 20:22:15 EST


On Tue, 2 Dec 2008 00:32:29 +0000 (GMT)
Hugh Dickins <hugh@xxxxxxxxxxx> wrote:

> On Mon, 1 Dec 2008, KAMEZAWA Hiroyuki wrote:
> > mem_cgroup_xxx_charge(...gfpmask) function take gfpmask as its argument.
> > But this gfp_t is only used for check GFP_RECALIM_MASK. In other words,
> > memcg has no interst where the memory should be reclaimed from.
> > It just see usage of pages.
> >
> > Using bare gfp_t is misleading and this is a patch for explaining
> > expected behavior in explicit way. (better name/code is welcome.)
> >
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
>
> Sorry, but I hate it. You're spreading mem_cgroup ugliness throughout.
>
> This is a good demonstation of why I wanted to go the opposite way to
> Nick, why I wanted to push the masking as low as possible; I accept
> Nick's point, but please, not at the expense of being this ugly.
>
Okay ;)


> It's a pity about loop over tmpfs, IIRC without that case you could
> just remove the gfp_mask argument from every one of them - that is a
> change I would appreciate! Or am I forgetting some other cases?
>

Ah, I considered that. putting gfp_mask here is for rememebering
"this will do page reclaim".

> I think you should remove the arg from every one you can, and change
> those shmem_getpage() ones to say "gfp & GFP_RECLAIM_MASK" (just as
> we'll be changing the radix_tree_preload later).
>
I'll look into again

> Hmm, shmem_getpage()'s mem_cgroup_cache_charge(,, gfp & ~__GFP_HIGHMEM)
> has morphed into mem_cgroup_cache_charge(,, GFP_HIGHUSER_MOVABLE) in
> mmotm (well, mmo2daysago, I've not looked today). How come?
>
Maybe my patches/memcg-fix-gfp_mask-of-callers-of-charge.patch.
Hmm, I'll write replacement patch for this and remove gfp_t.

> It used to be the case that the mem_cgroup calls made their own
> memory allocations, and that could become the case again in future:
> the gfp mask was passed down for those allocations, and it was almost
> everywhere GFP_KERNEL.
>
> mem_cgroup charging may need to reclaim some memory from the memcg:
> it happens that the category of memory it goes for is HIGHUSER_MOVABLE;
> and it happens that the gfp mask for any incidental allocations it might
> want to make, also provides the GFP_RECLAIM_MASK flags for that reclaim.
> But please keep that within mem_cgroup_cache_charge_common or whatever.
>

Okay. I'll write replacment for memcg-fix-gfp_mask-of-callers-of-charge.patch.

-Kame

> Hugh
>
> >
> > include/linux/gfp.h | 19 +++++++++++++++++++
> > mm/filemap.c | 2 +-
> > mm/memory.c | 12 +++++++-----
> > mm/shmem.c | 8 ++++----
> > mm/swapfile.c | 2 +-
> > mm/vmscan.c | 3 +--
> > 6 files changed, 33 insertions(+), 13 deletions(-)
> >
> > Index: mmotm-2.6.28-Nov30/include/linux/gfp.h
> > ===================================================================
> > --- mmotm-2.6.28-Nov30.orig/include/linux/gfp.h
> > +++ mmotm-2.6.28-Nov30/include/linux/gfp.h
> > @@ -245,4 +245,23 @@ void drain_zone_pages(struct zone *zone,
> > void drain_all_pages(void);
> > void drain_local_pages(void *dummy);
> >
> > +#ifdef CONFIG_CGROUP_MEM_RES_CTLR
> > +static inline gfp_t gfp_memcg_mask(gfp_t gfp)
> > +{
> > + gfp_t mask;
> > + /*
> > + * Memory Resource Controller memory reclaim is called to reduce usage
> > + * of memory, not to get free memory from specified area.
> > + * Remove zone constraints.
> > + */
> > + mask = gfp & GFP_RECLAIM_MASK;
> > + return mask | (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
> > +}
> > +#else
> > +static inline gfp_t gfp_memcg_mask(gfp_t gfp)
> > +{
> > + return gfp;
> > +}
> > +#endif
> > +
> > #endif /* __LINUX_GFP_H */
> > Index: mmotm-2.6.28-Nov30/mm/filemap.c
> > ===================================================================
> > --- mmotm-2.6.28-Nov30.orig/mm/filemap.c
> > +++ mmotm-2.6.28-Nov30/mm/filemap.c
> > @@ -461,7 +461,7 @@ int add_to_page_cache_locked(struct page
> > VM_BUG_ON(!PageLocked(page));
> >
> > error = mem_cgroup_cache_charge(page, current->mm,
> > - gfp_mask & ~__GFP_HIGHMEM);
> > + gfp_memcg_mask(gfp_mask));
> > if (error)
> > goto out;
> >
> > Index: mmotm-2.6.28-Nov30/mm/vmscan.c
> > ===================================================================
> > --- mmotm-2.6.28-Nov30.orig/mm/vmscan.c
> > +++ mmotm-2.6.28-Nov30/mm/vmscan.c
> > @@ -1733,8 +1733,7 @@ unsigned long try_to_free_mem_cgroup_pag
> > if (noswap)
> > sc.may_swap = 0;
> >
> > - sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
> > - (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
> > + sc.gfp_mask = gfp_memcg_mask(gfp_mask);
> > zonelist = NODE_DATA(numa_node_id())->node_zonelists;
> > return do_try_to_free_pages(zonelist, &sc);
> > }
> > Index: mmotm-2.6.28-Nov30/mm/memory.c
> > ===================================================================
> > --- mmotm-2.6.28-Nov30.orig/mm/memory.c
> > +++ mmotm-2.6.28-Nov30/mm/memory.c
> > @@ -1913,7 +1913,8 @@ gotten:
> > cow_user_page(new_page, old_page, address, vma);
> > __SetPageUptodate(new_page);
> >
> > - if (mem_cgroup_newpage_charge(new_page, mm, GFP_HIGHUSER_MOVABLE))
> > + if (mem_cgroup_newpage_charge(new_page, mm,
> > + gfp_memcg_mask(GFP_HIGHUSER_MOVABLE)))
> > goto oom_free_new;
> >
> > /*
> > @@ -2345,7 +2346,7 @@ static int do_swap_page(struct mm_struct
> > delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
> >
> > if (mem_cgroup_try_charge_swapin(mm, page,
> > - GFP_HIGHUSER_MOVABLE, &ptr) == -ENOMEM) {
> > + gfp_memcg_mask(GFP_HIGHUSER_MOVABLE), &ptr) == -ENOMEM) {
> > ret = VM_FAULT_OOM;
> > unlock_page(page);
> > goto out;
> > @@ -2437,7 +2438,8 @@ static int do_anonymous_page(struct mm_s
> > goto oom;
> > __SetPageUptodate(page);
> >
> > - if (mem_cgroup_newpage_charge(page, mm, GFP_HIGHUSER_MOVABLE))
> > + if (mem_cgroup_newpage_charge(page, mm,
> > + gfp_memcg_mask(GFP_HIGHUSER_MOVABLE)))
> > goto oom_free_page;
> >
> > entry = mk_pte(page, vma->vm_page_prot);
> > @@ -2528,8 +2530,8 @@ static int __do_fault(struct mm_struct *
> > ret = VM_FAULT_OOM;
> > goto out;
> > }
> > - if (mem_cgroup_newpage_charge(page,
> > - mm, GFP_HIGHUSER_MOVABLE)) {
> > + if (mem_cgroup_newpage_charge(page, mm,
> > + gfp_memcg_mask(GFP_HIGHUSER_MOVABLE))) {
> > ret = VM_FAULT_OOM;
> > page_cache_release(page);
> > goto out;
> > Index: mmotm-2.6.28-Nov30/mm/swapfile.c
> > ===================================================================
> > --- mmotm-2.6.28-Nov30.orig/mm/swapfile.c
> > +++ mmotm-2.6.28-Nov30/mm/swapfile.c
> > @@ -698,7 +698,7 @@ static int unuse_pte(struct vm_area_stru
> > int ret = 1;
> >
> > if (mem_cgroup_try_charge_swapin(vma->vm_mm, page,
> > - GFP_HIGHUSER_MOVABLE, &ptr))
> > + gfp_memcg_mask(GFP_HIGHUSER_MOVABLE), &ptr))
> > ret = -ENOMEM;
> >
> > pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> > Index: mmotm-2.6.28-Nov30/mm/shmem.c
> > ===================================================================
> > --- mmotm-2.6.28-Nov30.orig/mm/shmem.c
> > +++ mmotm-2.6.28-Nov30/mm/shmem.c
> > @@ -924,8 +924,8 @@ found:
> > * Charge page using GFP_HIGHUSER_MOVABLE while we can wait.
> > * charged back to the user(not to caller) when swap account is used.
> > */
> > - error = mem_cgroup_cache_charge_swapin(page,
> > - current->mm, GFP_HIGHUSER_MOVABLE, true);
> > + error = mem_cgroup_cache_charge_swapin(page, current->mm,
> > + gfp_memcg_mask(GFP_HIGHUSER_MOVABLE), true);
> > if (error)
> > goto out;
> > error = radix_tree_preload(GFP_KERNEL);
> > @@ -1267,7 +1267,7 @@ repeat:
> > * charge against this swap cache here.
> > */
> > if (mem_cgroup_cache_charge_swapin(swappage,
> > - current->mm, gfp, false)) {
> > + current->mm, gfp_memcg_mask(gfp), false)) {
> > page_cache_release(swappage);
> > error = -ENOMEM;
> > goto failed;
> > @@ -1385,7 +1385,7 @@ repeat:
> >
> > /* Precharge page while we can wait, compensate after */
> > error = mem_cgroup_cache_charge(filepage, current->mm,
> > - GFP_HIGHUSER_MOVABLE);
> > + gfp_memcg_mask(GFP_HIGHUSER_MOVABLE));
> > if (error) {
> > page_cache_release(filepage);
> > shmem_unacct_blocks(info->flags, 1);
> >
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/