Re: [PATCH] memcg, hugetlb: pages allocated for hugetlb's overcommit will be charged to memcg

From: TSUKADA Koutaro
Date: Mon May 07 2018 - 20:35:58 EST


On 2018/05/04 13:26, Mike Kravetz wrote:
> Thank you for the explanation of your use case. I now understand what
> you were trying to accomplish with your patch.
>
> Your use case reminds me of the session at LSFMM titled "swap accounting".
> https://lwn.net/Articles/753162/
>
> I hope someone with more cgroup expertise (Johannes? Aneesh?) can provide
> comments. My experience in that area is limited.

I am waiting for comments from the experts. The point is whether a surplus
hugetlb page that is allocated directly from the buddy allocator should be
charged to the memory cgroup or not.

> One question that comes to mind is "Why would the user/application writer
> use hugetlbfs overcommit instead of THP?". For hugetlbfs overcommit, they
> need to be prepared for huge page allocations to fail. So, in some cases
> they may not be able to use any hugetlbfs pages. This is not much different
> than THP. However, in the THP case huge page allocation failures and fall
> back to base pages is transparent to the user. With THP, the normal memory
> cgroup controller should work well.

Certainly THP is much easier to use than hugetlb on a kernel with a 4KB
page size. On the other hand, some distributions (SUSE, RHEL) use a 64KB
page size, and the THP size in that case is 512MB (not 2MB). I am afraid
that a 512MB huge page is somewhat difficult to use.

In hugetlbfs, the choice of page sizes is wider thanks to the contiguous
bit supported by the aarch64 architecture, and 2MB, 512MB, 16GB, and 4TB
can be used in a 64KB environment (actually, only 2MB is usable...). I also
believe THP is the best choice in a 4KB environment, but I am considering
how to use huge pages in a 64KB environment.
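
The sizes above follow from simple arithmetic. The snippet below is only a
sketch, assuming 8-byte page table entries and a contiguous hint that spans
32 entries at both the PTE and PMD level (the values I understand the 64KB
granule to use). The function and key names are made up for illustration.

```python
# Sketch: derive aarch64 huge page sizes from the base page size.
# Assumptions (not taken from the thread): 8-byte page table entries, and a
# contiguous hint covering 32 entries at both PTE and PMD level.

PTE_BYTES = 8


def huge_page_sizes(base_page_size, cont_entries=32):
    """Return candidate huge page sizes in bytes (hypothetical helper)."""
    # One page table occupies one base page, so it holds this many entries.
    entries_per_table = base_page_size // PTE_BYTES
    # A PMD-level block maps one full table of base pages (the THP size).
    pmd_size = base_page_size * entries_per_table
    return {
        "cont_pte": base_page_size * cont_entries,
        "pmd": pmd_size,
        "cont_pmd": pmd_size * cont_entries,
        "pud": pmd_size * entries_per_table,
    }


if __name__ == "__main__":
    MB = 1024 * 1024
    for name, size in huge_page_sizes(64 * 1024).items():
        print(f"{name}: {size // MB} MB")
```

With a 64KB base page this gives 2MB, 512MB, 16GB, and 4TB, while a 4KB
base page gives a 2MB PMD size, which matches the THP sizes discussed above.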
--
Tsukada Koutaro