Re: [PATCH 1/7] mm: memcontrol: charge swap to cgroup2
From: Johannes Weiner
Date: Mon Dec 14 2015 - 10:48:54 EST

On Mon, Dec 14, 2015 at 04:30:37PM +0100, Michal Hocko wrote:
> On Thu 10-12-15 14:39:14, Vladimir Davydov wrote:
> > In the legacy hierarchy we charge memsw, which is dubious, because:
> >
> > - memsw.limit must be >= memory.limit, so it is impossible to limit
> > swap usage less than memory usage. Taking into account the fact that
> > the primary limiting mechanism in the unified hierarchy is
> > memory.high while memory.limit is either left unset or set to a very
> > large value, moving memsw.limit knob to the unified hierarchy would
> > effectively make it impossible to limit swap usage according to the
> > user preference.
> >
> > - memsw.usage != memory.usage + swap.usage, because a page occupying
> > both swap entry and a swap cache page is charged only once to memsw
> > counter. As a result, it is possible to effectively eat up to
> > memory.limit of memory pages *and* memsw.limit of swap entries, which
> > looks unexpected.
> >
> > That said, we should provide a different swap limiting mechanism for
> > cgroup2.
> >
> > This patch adds mem_cgroup->swap counter, which charges the actual
> > number of swap entries used by a cgroup. It is only charged in the
> > unified hierarchy, while the legacy hierarchy memsw logic is left
> > intact.
>
> I agree that the previous semantic was awkward. The problem I can see
> with this approach is that once the swap limit is reached the anon
> memory pressure might spill over to other and unrelated memcgs during
> the global memory pressure. I guess this is what Kame referred to as
> anon would become mlocked basically. This would be even more of an issue
> with resource delegation to sub-hierarchies because nobody will prevent
> setting the swap amount to a small value and use that as an anon memory
> protection.

Overcommitting untrusted workloads is already problematic because
reclaim is based on heuristics and references, and a malicious
workload can already interfere with it and create pressure on the
system or its neighboring groups. This patch doesn't make it better,
but it's not a new problem.

If you don't trust subhierarchies, don't give them more memory than
you can handle them taking. Any swap you then give them is a resource
for them to use on top of that memory, not for you at the top level.
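
To make the accounting difference in the quoted changelog concrete,
here is a toy model. It is only a sketch, not kernel code: the Usage
class, its three page states, and the fits_legacy()/fits_cgroup2()
helpers are hypothetical names made up for illustration, and the
limits are arbitrary page counts. It assumes, as the changelog
states, that a page holding both a swap entry and a swap cache page
is charged once to memsw.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Toy per-cgroup page counts (hypothetical model, not kernel state)."""
    resident_only: int  # pages in memory with no swap entry
    swap_only: int      # swap entries whose page is no longer in memory
    swapcache: int      # pages holding both a swap entry and a swap cache page

    @property
    def memory(self) -> int:
        # Everything that occupies a page of RAM.
        return self.resident_only + self.swapcache

    @property
    def swap_entries(self) -> int:
        # Everything that occupies a swap slot.
        return self.swap_only + self.swapcache

    @property
    def memsw(self) -> int:
        # Legacy counter: a swapcache page is charged only once.
        return self.resident_only + self.swap_only + self.swapcache


def fits_legacy(u: Usage, memory_limit: int, memsw_limit: int) -> bool:
    """Legacy-style check: memory and combined memsw limits."""
    return u.memory <= memory_limit and u.memsw <= memsw_limit


def fits_cgroup2(u: Usage, memory_limit: int, swap_limit: int) -> bool:
    """Separate-counter check: memory limit plus a limit on swap entries."""
    return u.memory <= memory_limit and u.swap_entries <= swap_limit


# Four pages sit in swap cache (RAM + swap slot), four more are swap-only.
u = Usage(resident_only=0, swap_only=4, swapcache=4)

# memsw undercounts the sum of memory and swap usage by the swapcache pages.
gap = (u.memory + u.swap_entries) - u.memsw

legacy_ok = fits_legacy(u, memory_limit=4, memsw_limit=8)
cgroup2_ok = fits_cgroup2(u, memory_limit=4, swap_limit=4)
```

In this model the group holds 4 pages of memory and 8 swap entries
while staying within memory.limit=4 and memsw.limit=8, which is the
"memory.limit of memory pages *and* memsw.limit of swap entries" case
from the changelog. With a separate swap counter the same state is
rejected as soon as the swap limit is below 8 entries.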