Re: [PATCH] memcg: Ignore unprotected parent in mem_cgroup_protected()

From: Xunlei Pang
Date: Sun Jun 16 2019 - 08:02:04 EST


Hi Chris,

On 2019/6/16 PM 6:37, Chris Down wrote:
> Hi Xunlei,
>
> Xunlei Pang writes:
>> docker and various types (different memory capacities) of containers
>> are managed by k8s; it's a burden for k8s to maintain those dynamic
>> figures, so simply setting "max" on key containers is always welcome.
>
> Right, setting "max" is generally a fine way of going about it.
>
>> Setting "max" on docker also protects the docker cgroup's memory (as
>> docker itself has tasks) unnecessarily.
>
> That's not correct -- leaf memcgs have to _explicitly_ request memory
> protection. From the documentation:
>
>   memory.low
>
>   [...]
>
>   Best-effort memory protection. If the memory usages of a
>   cgroup and all its ancestors are below their low boundaries,
>   the cgroup's memory won't be reclaimed unless memory can be
>   reclaimed from unprotected cgroups.
>
> Note the part that the cgroup itself also must be within its low
> boundary, which is not implied simply by having ancestors that would
> permit propagation of protections.
>
> In this case, Docker just shouldn't request it for those Docker-related
> tasks, and they won't get any. That seems a lot simpler and more
> intuitive than special casing "0" in ancestors.
>
>> This patch doesn't take effect on any intermediate layer with
>> positive memory.min set, it requires all the ancestors having
>> 0 memory.min to work.
>>
>> Nothing special change, but more flexible to business deployment...
>
> Not so, this change is extremely "special". It violates the basic
> expectation that 0 means no possibility of propagation of protection,
> and I still don't see a compelling argument why Docker can't just set
> "max" in the intermediate cgroup and not accept any protection in leaf
> memcgs that it doesn't want protection for.

I see the reason now: I'm using cgroup v1 (with a memory.min backport),
which permits tasks to exist in the "docker" cgroup's own cgroup.procs.

For cgroup v2, it's not a problem.
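
For illustration only, here is a toy Python model of the memory.low rule
quoted above. It is not the kernel's implementation: the Cgroup class,
within_low() and the byte figures are all made up for this sketch, and
"below" is treated as "at or below". It shows a leaf with low=0 getting
no protection under a "max" parent, while memory charged to the parent
itself (the v1 case with tasks in "docker") stays within the boundary.

```python
import math
from dataclasses import dataclass
from typing import Optional

MAX = math.inf  # stands in for writing "max" to memory.low


@dataclass
class Cgroup:
    name: str
    low: float                       # memory.low boundary, in bytes
    usage: int                       # memory charged to this cgroup, in bytes
    parent: Optional["Cgroup"] = None


def within_low(cg: Cgroup) -> bool:
    """True if cg and every ancestor are at or below their low boundary."""
    node = cg
    while node is not None:
        if node.usage > node.low:
            return False
        node = node.parent
    return True


# Hypothetical hierarchy: "docker" is the intermediate cgroup set to "max".
docker = Cgroup("docker", low=MAX, usage=300)
key = Cgroup("key", low=MAX, usage=200, parent=docker)    # requests protection
other = Cgroup("other", low=0, usage=100, parent=docker)  # does not request it

for cg in (docker, key, other):
    print(cg.name, within_low(cg))
```

In this model "other" is outside its own boundary even though its parent
is set to "max", which matches the point that a leaf has to request
protection explicitly.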

Thanks,
Xunlei