Re: [PATCH 1/2] gfp: add __GFP_NOACCOUNT

From: Michal Hocko
Date: Wed May 06 2015 - 07:59:49 EST


On Tue 05-05-15 12:45:42, Vladimir Davydov wrote:
> Not all kmem allocations should be accounted to memcg. The following
> patch gives an example when accounting of a certain type of allocations
> to memcg can effectively result in a memory leak.

> This patch adds the __GFP_NOACCOUNT flag which if passed to kmalloc
> and friends will force the allocation to go through the root
> cgroup. It will be used by the next patch.

The flag name is way too generic. Nothing in it says that the accounting
in question is kmemcg accounting. Would __GFP_NO_KMEMCG be better?

I was going to suggest a per-cache opt-out rather than a gfp flag. That
would work just fine for kmemleak, as it already uses its own cache. But
ida_simple_get would be trickier because it doesn't use any special
cache and, moreover, only one user seems to have a problem, so a
per-cache opt-out doesn't sound like a good fit.
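
To make the per-cache vs. gfp flag trade-off concrete, here is a minimal
userspace sketch. Everything in it is hypothetical and made up for
illustration (struct sketch_cache, SKETCH_CACHE_NOACCOUNT,
sketch_cache_accounted, the cache names); none of it is existing kernel
API. It only shows that a per-cache opt-out applies to every user of the
cache, which suits a dedicated cache but not a shared one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of this is existing kernel API. */
#define SKETCH_CACHE_NOACCOUNT 0x1u   /* made-up per-cache opt-out flag */

struct sketch_cache {
    const char *name;
    unsigned int flags;
};

/*
 * Per-cache policy: the decision depends only on the cache, so every
 * caller allocating from it gets the same treatment.
 */
static inline bool sketch_cache_accounted(const struct sketch_cache *c)
{
    return !(c->flags & SKETCH_CACHE_NOACCOUNT);
}

static inline void sketch_selfcheck(void)
{
    /* A dedicated cache (the kmemleak case) can simply be marked. */
    struct sketch_cache dedicated = { "dedicated", SKETCH_CACHE_NOACCOUNT };
    /*
     * A shared cache (the ida_simple_get case) cannot be marked without
     * un-accounting all of its other users, so it stays accounted.
     */
    struct sketch_cache shared = { "shared", 0 };

    assert(!sketch_cache_accounted(&dedicated));
    assert(sketch_cache_accounted(&shared));
}
```

A gfp flag, by contrast, is chosen per call site, which is why it can
cover a single problematic user of a shared cache.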

So I do not object to an opt-out from kmemcg accounting, but I really
think the name should be changed.
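
For reference, the opt-out itself boils down to one more early exit in
the charge path. Below is a minimal userspace sketch of that decision,
not the kernel code: gfp_t is a plain typedef here, the __GFP_NOFAIL
value is an assumption, sketch_should_charge is a hypothetical helper,
and the remaining checks from the quoted hunks (in_interrupt() and so
on) are left out. Only the __GFP_NOACCOUNT bit is taken from the quoted
gfp.h hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for illustration only; not the kernel definitions. */
typedef unsigned int gfp_t;

#define __GFP_NOFAIL    0x800u      /* value assumed for this sketch */
#define __GFP_NOACCOUNT 0x100000u   /* bit taken from the quoted gfp.h hunk */

/*
 * Hypothetical helper: returns true when the allocation would be charged
 * to the current memcg and false when it is left unaccounted (i.e. it
 * stays in the root cgroup).
 */
static inline bool sketch_should_charge(gfp_t gfp, bool kmem_enabled)
{
    if (!kmem_enabled)
        return false;
    if (gfp & __GFP_NOACCOUNT)  /* caller opted out of kmemcg accounting */
        return false;
    if (gfp & __GFP_NOFAIL)     /* not charged, as in the quoted code */
        return false;
    return true;
}

static inline void sketch_selfcheck(void)
{
    assert(sketch_should_charge(0, true));
    assert(!sketch_should_charge(__GFP_NOACCOUNT, true));
}
```

Whatever name the flag ends up with, the check stays the same; only the
macro name changes.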

> Note, since in case of kmemleak enabled each kmalloc implies yet another
> allocation from the kmemleak_object cache, we add __GFP_NOACCOUNT to
> gfp_kmemleak_mask.
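
The effect of the kmemleak change can be checked in isolation. The
sketch below is userspace only: the two macros mirror the old and new
gfp_kmemleak_mask from the quoted mm/kmemleak.c hunk (the _old/_new
suffixes are mine), and every flag value except __GFP_NOACCOUNT is an
assumption made for the example rather than something stated in this
patch.

```c
#include <assert.h>

/* Userspace stand-ins; all values except __GFP_NOACCOUNT are assumed. */
typedef unsigned int gfp_t;

#define __GFP_WAIT       0x10u
#define __GFP_HIGH       0x20u
#define __GFP_IO         0x40u
#define __GFP_FS         0x80u
#define __GFP_NOWARN     0x200u
#define __GFP_NORETRY    0x1000u
#define __GFP_NOMEMALLOC 0x10000u
#define __GFP_NOACCOUNT  0x100000u  /* bit taken from the quoted gfp.h hunk */

#define GFP_KERNEL (__GFP_WAIT | __GFP_IO | __GFP_FS)
#define GFP_ATOMIC (__GFP_HIGH)

/* Mask before the patch: __GFP_NOACCOUNT is filtered out. */
#define gfp_kmemleak_mask_old(gfp) (((gfp) & (GFP_KERNEL | GFP_ATOMIC)) | \
                                    __GFP_NORETRY | __GFP_NOMEMALLOC | \
                                    __GFP_NOWARN)

/* Mask after the patch: __GFP_NOACCOUNT survives the filter. */
#define gfp_kmemleak_mask_new(gfp) (((gfp) & (GFP_KERNEL | GFP_ATOMIC | \
                                    __GFP_NOACCOUNT)) | \
                                    __GFP_NORETRY | __GFP_NOMEMALLOC | \
                                    __GFP_NOWARN)

static inline void sketch_selfcheck(void)
{
    gfp_t gfp = GFP_KERNEL | __GFP_NOACCOUNT;

    /* Old mask drops the opt-out, so the kmemleak object would be charged. */
    assert((gfp_kmemleak_mask_old(gfp) & __GFP_NOACCOUNT) == 0);
    /* New mask passes it through to the kmemleak_object allocation. */
    assert((gfp_kmemleak_mask_new(gfp) & __GFP_NOACCOUNT) != 0);
}
```

Note that the new mask only propagates the flag when the caller set it;
it does not add it to allocations that did not ask for the opt-out.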

> Signed-off-by: Vladimir Davydov <vdavydov@xxxxxxxxxxxxx>
> ---
> include/linux/gfp.h | 2 ++
> include/linux/memcontrol.h | 4 ++++
> mm/kmemleak.c | 3 ++-
> 3 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 97a9373e61e8..37c422df2a0f 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -30,6 +30,7 @@ struct vm_area_struct;
> #define ___GFP_HARDWALL 0x20000u
> #define ___GFP_THISNODE 0x40000u
> #define ___GFP_RECLAIMABLE 0x80000u
> +#define ___GFP_NOACCOUNT 0x100000u
> #define ___GFP_NOTRACK 0x200000u
> #define ___GFP_NO_KSWAPD 0x400000u
> #define ___GFP_OTHER_NODE 0x800000u
> @@ -87,6 +88,7 @@ struct vm_area_struct;
> #define __GFP_HARDWALL ((__force gfp_t)___GFP_HARDWALL) /* Enforce hardwall cpuset memory allocs */
> #define __GFP_THISNODE ((__force gfp_t)___GFP_THISNODE)/* No fallback, no policies */
> #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE) /* Page is reclaimable */
> +#define __GFP_NOACCOUNT ((__force gfp_t)___GFP_NOACCOUNT) /* Don't account to memcg */
> #define __GFP_NOTRACK ((__force gfp_t)___GFP_NOTRACK) /* Don't track with kmemcheck */
>
> #define __GFP_NO_KSWAPD ((__force gfp_t)___GFP_NO_KSWAPD)
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 72dff5fb0d0c..6c8918114804 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -463,6 +463,8 @@ memcg_kmem_newpage_charge(gfp_t gfp, struct mem_cgroup **memcg, int order)
> if (!memcg_kmem_enabled())
> return true;
>
> + if (gfp & __GFP_NOACCOUNT)
> + return true;
> /*
> * __GFP_NOFAIL allocations will move on even if charging is not
> * possible. Therefore we don't even try, and have this allocation
> @@ -522,6 +524,8 @@ memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
> {
> if (!memcg_kmem_enabled())
> return cachep;
> + if (gfp & __GFP_NOACCOUNT)
> + return cachep;
> if (gfp & __GFP_NOFAIL)
> return cachep;
> if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index 5405aff5a590..f0fe4f2c1fa7 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -115,7 +115,8 @@
> #define BYTES_PER_POINTER sizeof(void *)
>
> /* GFP bitmask for kmemleak internal allocations */
> -#define gfp_kmemleak_mask(gfp) (((gfp) & (GFP_KERNEL | GFP_ATOMIC)) | \
> +#define gfp_kmemleak_mask(gfp) (((gfp) & (GFP_KERNEL | GFP_ATOMIC | \
> + __GFP_NOACCOUNT)) | \
> __GFP_NORETRY | __GFP_NOMEMALLOC | \
> __GFP_NOWARN)
>
> --
> 1.7.10.4
>

--
Michal Hocko
SUSE Labs