Re: [PATCH] mm/memcg: support control THP behaviour in cgroup

From: CGEL
Date: Mon May 09 2022 - 07:27:12 EST


On Mon, May 09, 2022 at 12:00:28PM +0200, Michal Hocko wrote:
> On Sat 07-05-22 02:05:25, CGEL wrote:
> [...]
> > If there are many containers to run on one host, and some of them have high
> > performance requirements, administrator could turn on thp for them:
> > # docker run -it --thp-enabled=always
> > Then all the processes in those containers will always use thp.
> > While other containers turn off thp by:
> > # docker run -it --thp-enabled=never
>
> I do not know. The THP config space is already too confusing and complex
> and this just adds on top. E.g. is the behavior of the knob
> hierarchical? What is the policy if parent memcg says madivise while
> child says always? How does the per-application configuration aligns
> with all that (e.g. memcg policy madivise but application says never via
> prctl while still uses some madvised - e.g. via library).
>

The cgroup THP setting mirrors the host-wide one and is completely independent,
just like /sys/fs/cgroup/memory.swappiness. That means if one cgroup is configured
with 'always' for THP, that has nothing to do with the host or any other cgroup,
so the knob is not hierarchical. This makes it simple for users to understand
and control.
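
To make the parent/child question concrete, here is a minimal sketch of that
lookup. It is an illustration only, not code from the patch: struct cg,
enum cg_thp and cg_effective_thp() are hypothetical names, and the only
assumption is the rule stated above, that each cgroup's own value is used
and the parent is never consulted.

```c
#include <assert.h>
#include <stddef.h>

/* Same three values as the host-wide THP "enabled" knob. */
enum cg_thp { CG_THP_NEVER, CG_THP_MADVISE, CG_THP_ALWAYS };

/* Hypothetical stand-in for a memcg; this is not struct mem_cgroup. */
struct cg {
	const struct cg *parent;	/* NULL for the root cgroup */
	enum cg_thp thp;		/* this cgroup's own setting */
};

/* Non-hierarchical lookup: only the cgroup's own value matters. */
static enum cg_thp cg_effective_thp(const struct cg *cg)
{
	assert(cg != NULL);
	return cg->thp;			/* cg->parent is deliberately ignored */
}
```

With a parent set to madvise and a child set to always, the child resolves to
always and the parent stays at madvise.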

If the memcg policy is madvise but the application says never, then, just as on
the host, the result is no THP for that application.
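
A rough sketch of how the pieces combine, assuming the memcg mode simply takes
the place of the host-wide mode in the usual check. This is not the patch
code: thp_allowed() and its parameters are hypothetical, and MADV_NOHUGEPAGE
and the other existing checks are left out to keep it short.

```c
#include <assert.h>
#include <stdbool.h>

enum thp_mode { THP_NEVER, THP_MADVISE, THP_ALWAYS };

/*
 * mode:          policy that applies to the task (host-wide or per-memcg)
 * task_disabled: task opted out, e.g. prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0)
 * vma_madvised:  the range was marked with madvise(MADV_HUGEPAGE)
 */
static bool thp_allowed(enum thp_mode mode, bool task_disabled,
			bool vma_madvised)
{
	/* A per-task opt-out wins over everything, including madvise. */
	if (task_disabled)
		return false;

	switch (mode) {
	case THP_ALWAYS:
		return true;
	case THP_MADVISE:
		return vma_madvised;
	default:
		assert(mode == THP_NEVER);
		return false;
	}
}
```

So thp_allowed(THP_MADVISE, true, true) is false: even if a library madvises
some range, the application's prctl opt-out still means no THP.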

> > By doing this we could promote important containers's performance with less
> > footprint of thp.
>
> Do we really want to provide something like THP based QoS? To me it
> sounds like a bad idea and if the justification is "it might be useful"
> then I would say no. So you really need to come with a very good usecase
> to promote this further.

At least on some 5G (communication technology) machines, it is useful to provide
THP based QoS. Those 5G machines use a micro-service software architecture; in
other words, one service application runs in one container. The container,
rather than the whole host, becomes the suitable management unit. And some
performance sensitive containers need THP to provide low latency communication.
But if we set THP to 'always' for the whole host, it consumes more memory (on
our machines that is about 10% of total memory). Unnecessary huge pages also
increase memory pressure, add latency to minor page faults, and add overhead
when splitting huge pages or coalescing normal sized pages into huge pages.

So container managers should welcome cgroup based THP QoS.

> --
> Michal Hocko
> SUSE Labs