Re: [PATCH 1/7] mm: memcontrol: charge swap to cgroup2

From: Johannes Weiner
Date: Tue Dec 15 2015 - 09:50:32 EST


On Tue, Dec 15, 2015 at 12:22:41PM +0900, Kamezawa Hiroyuki wrote:
> On 2015/12/15 4:42, Vladimir Davydov wrote:
> >Anyway, if you don't trust a container you'd better set the hard memory
> >limit so that it can't hurt others no matter what it runs and how it
> >tweaks its sub-tree knobs.
>
> Limiting swap can easily cause "OOM-Killer even while there are available swap"
> with easy mistake. Can't you add "swap excess" switch to sysctl to allow global
> memory reclaim can ignore swap limitation ?

That never worked with a combined memory+swap limit, either. How could
it? The parent might swap you out under pressure, but simply touching
a few of your anon pages causes them to get swapped back in, thrashing
with whatever the parent was trying to do. Your ability to swap it out
is simply no protection against a group touching its pages.

Allowing the parent to exceed swap with separate counters makes even
less sense, because every page swapped out frees up a page of memory
that the child can reuse. For every swap page that exceeds the limit,
the child gets a free memory page! The child doesn't even have to
cause swapin, it can just steal whatever the parent tried to free up,
and meanwhile its combined memory & swap footprint explodes.
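
To put rough numbers on that argument, here is a toy model. It is a
sketch under simplifying assumptions, not kernel behaviour: the function
name, the page counts and the "child instantly refills every freed page"
rule are made up purely for illustration.

```python
def simulate(mem_limit, swap_limit, reclaim_rounds, parent_ignores_swap_limit):
    """Return (mem_pages, swap_pages) charged to a child after parent reclaim.

    Toy model (assumption): the child sits at its memory limit and
    immediately reuses every memory page that reclaim frees.
    """
    mem = mem_limit
    swap = 0
    for _ in range(reclaim_rounds):
        if swap >= swap_limit and not parent_ignores_swap_limit:
            break  # swap limit enforced: no more of the child's pages go to swap
        # Parent reclaim swaps out one of the child's pages ...
        mem -= 1
        swap += 1
        # ... and the child takes the freed memory page straight back.
        mem += 1
    return mem, swap


enforced = simulate(100, 50, 1000, parent_ignores_swap_limit=False)
ignored = simulate(100, 50, 1000, parent_ignores_swap_limit=True)
print("limit enforced:", enforced, "footprint", sum(enforced))
print("limit ignored: ", ignored, "footprint", sum(ignored))
```

With the limit enforced the child's footprint stops at memory limit plus
swap limit. With the parent allowed to exceed it, the footprint grows by
one page per reclaim round and the parent gains nothing.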

The answer is and always should have been: don't overcommit untrusted
cgroups. Think of swap as a resource you distribute, not as breathing
room for the parents to rely on. Because they can't rely on it, and
never could.

And the new separate swap counter makes this explicit.
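
"Distribute, don't overcommit" boils down to a sum check over the
children's limits. A minimal sketch, assuming you track the per-child
swap limits yourself; the function name, group names and page counts
below are hypothetical:

```python
def overcommit(child_limits, capacity):
    """Return by how many pages the children's limits exceed capacity.

    0 means the resource is fully backed: every child can hit its limit
    at the same time without depending on a sibling staying below its own.
    """
    return max(0, sum(child_limits.values()) - capacity)


# Hypothetical swap limits, in pages, for two untrusted groups.
limits = {"container-a": 300, "container-b": 300}
print(overcommit(limits, capacity=512))
```

A non-zero result means the parent is counting on breathing room it
cannot actually get if the children use what they were given.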