Re: [PATCH 5/8] mm: memcontrol: account socket memory on unified hierarchy

From: Johannes Weiner
Date: Mon Oct 26 2015 - 12:56:39 EST

On Fri, Oct 23, 2015 at 06:59:57AM -0700, David Miller wrote:
> From: Michal Hocko <mhocko@xxxxxxxxxx>
> Date: Fri, 23 Oct 2015 15:19:56 +0200
> > On Thu 22-10-15 00:21:33, Johannes Weiner wrote:
> >> Socket memory can be a significant share of overall memory consumed by
> >> common workloads. In order to provide reasonable resource isolation
> >> out-of-the-box in the unified hierarchy, this type of memory needs to
> >> be accounted and tracked per default in the memory controller.
> >
> > What about users who do not want to pay an additional overhead for the
> > accounting? How can they disable it?
> Yeah, this really cannot pass.
> This extra overhead will be seen by %99.9999 of users, since entities
> (especially distributions) just flip on all of these config options by
> default.

Okay, there are several layers to this issue.

If you boot a machine with a CONFIG_MEMCG distribution kernel and
don't create any cgroups, I agree there shouldn't be any overhead.

I already sent a patch to generally remove memory accounting on the
system or root level. I can easily update this patch here to not have
any socket buffer accounting overhead for systems that don't actively
use cgroups. Would you be okay with a branch on sk->sk_memcg in the
network accounting path? I'd leave that NULL on the system level then.

Then there is of course the case when you create cgroups for process
organization but don't care about memory accounting. Systemd comes to
mind. Or even if you create cgroups to track other resources like CPU
but don't care about memory. The unified hierarchy no longer enables
controllers on new cgroups per default, so unless you create a cgroup
and specifically tell it to account and track memory, you won't have
the socket memory accounting overhead, either.

Then there is the third case, where you create a control group to
specifically manage and limit the memory consumption of a workload. In
that scenario, a major memory consumer like socket buffers, which can
easily grow until OOM, should definitely be included in the tracking
in order to properly contain both untrusted (possibly malicious) and
trusted (possibly buggy) workloads. This is not a hole we can
reasonbly leave unpatched for general purpose resource management.

Now you could argue that there might exist specialized workloads that
need to account anonymous pages and page cache, but not socket memory
buffers. Or any other combination of pick-and-choose consumers. But
honestly, nowadays all our paths are lockless, and the counting is an
atomic-add-return with a per-cpu batch cache. I don't think there is a
compelling case for an elaborate interface to make individual memory
consumers configurable inside the memory controller.

So in summary, would you be okay with this patch if networking only
called into the memory controller when you explicitely create a cgroup
AND tell it to track the memory footprint of the workload in it?
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at