Re: [PATCH 5/8] mm: memcontrol: account socket memory on unified hierarchy

From: Michal Hocko
Date: Fri Nov 06 2015 - 07:51:48 EST


On Thu 05-11-15 17:32:51, Johannes Weiner wrote:
> On Thu, Nov 05, 2015 at 05:28:03PM +0100, Michal Hocko wrote:
[...]
> > Yes, that part is clear and Johannes made it clear that the kmem tcp
> > part is disabled by default. Or are you considering also all the slab
> > usage by the networking code as well?
>
> Michal, there shouldn't be any tracking or accounting going on per
> default when you boot into a fresh system.
>
> I removed all accounting and statistics on the system level in
> cgroupv2, so distribution kernels can compile-time enable a single,
> feature-complete CONFIG_MEMCG that provides a full memory controller
> while at the same time puts no overhead on users that don't benefit
> from mem control at all and just want to use the machine bare-metal.

Yes that part is clear and I am not disputing it _at all_. It is just
that changes are high that memory controller _will_ be enabled in a
typical distribution systems. E.g. systemd _is_ enabling all resource
controllers by default for some services with Delegate=yes option.

> This is completely doable. My new series does it for skmem, but I also
> want to retrofit the code to eliminate that current overhead for page
> cache, anonymous memory, slab memory and so forth.
>
> This is the only sane way to make the memory controller powerful and
> generally useful without having to make unreasonable compromises with
> memory consumers. We shouldn't even be *having* the discussion about
> whether we should sacrifice the quality of our interface in order to
> compromise with a class of users that doesn't care about any of this
> in the first place.
>
> So let's eliminate the cost for non-users, but make the memory
> controller feature-complete and useful--with reasonable cost,
> implementation, and interface--for our actual userbase.
>
> Paying the necessary cost for a functionality you actually want is not
> the problem. Paying for something that doesn't benefit you is.

I completely agree that a reasonable cost for those who _want_ the
functionality. It hasn't been shown that people actually lack kmem
accounting in the wild from the past in general. E.g. kmem controller
is even not enabled in opensuse nor SLES kernels and I do not remember
there was huge push to enable it.

I do understand that you want to have an out-of-the-box isolation
behavior which I agree is a nice-to-have feature. Especially with
a larger penetration of containerized workloads. But my point still
holds. This is not something everybody wants to have. So have a
configuration and a boot time option to override is the most reasonable
way to go. You can clearly see that this is already demand from tcp
kmem extension because they really _care_ about every single cpu cycle
even though some part of the userspace happens to have memcg enabled.

The question about the configuration default is a different question
and we can discuss that because this is not an easy one to decide right
now IMHO.
--
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/