Re: [PATCH] meminfo: show /proc/meminfo base on container's memcg

From: Glauber Costa
Date: Tue May 29 2012 - 04:26:39 EST


On 05/29/2012 06:56 AM, Gao feng wrote:
cgroup and namespaces are used for creating containers, but some
information is not isolated/virtualized. This patch isolates /proc/meminfo
information per container, using the memory cgroup. With this, top, free
and other tools inside a container work as expected (showing the
container's usage) without changes.

This patch is a trial to show memcg's info in /proc/meminfo if 'current'
is under a memcg other than root.

We show /proc/meminfo based on the container's memory cgroup.
Because a lot of the info can't be provided by memcg, and
commands such as top and free use only some entries of /proc/meminfo,
we replace just those entries with values from the memory cgroup.

If the container has no memcg, we show the host's /proc/meminfo
as before.

I have no idea how to deal with Buffers, so I just set it to zero;
it would be strange if Buffers were bigger than MemTotal.

Signed-off-by: Gao feng <gaofeng@xxxxxxxxxxxxxx>

This is the very same problem that exists with CPU cgroup.
So I'll tell you what kind of resistance we faced, and why I sort of agree with it.

In summary, there is no guarantee that 'current' wants to see this. It is what container environments want, but doing it unconditionally can
break applications out there.

With cpu it is easier to demonstrate: without pid namespaces you would still see all the other tasks in the system, but the tick figures won't match. Memory falls prey to the same issue,
and we really need a common solution for both.

A flag is too ugly, mount options are too ugly, and once parenting is in place, either one is hard to get right.

So I've seen a lot of people advocating that we should just use a userspace filesystem and bind mount it on top of the normal proc.

For instance: bind mount your special meminfo onto /proc/meminfo inside a container. Reads of the latter would be redirected to the former, which would then assemble the proper results from the cgroup filesystem and display them.
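
To make the "assemble the proper results" step concrete, here is a rough sketch of what such a userspace helper could compute. It is only an illustration, not code from the patch: the function names parse_memcg_stat and render_meminfo are made up, the inputs are assumed to come from the memcg files memory.limit_in_bytes, memory.usage_in_bytes and memory.stat, and Buffers is simply reported as zero, as the patch description does. The filesystem and bind-mount plumbing is left out entirely.

```python
def parse_memcg_stat(text):
    """Parse the 'key value' lines of a memory.stat file into a dict of ints."""
    stat = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stat[parts[0]] = int(parts[1])
    return stat


def render_meminfo(limit_bytes, usage_bytes, stat, host_total_kb):
    """Build meminfo-style text from memcg numbers (hypothetical helper).

    limit_bytes   -- value read from memory.limit_in_bytes
    usage_bytes   -- value read from memory.usage_in_bytes
    stat          -- dict from parse_memcg_stat() (values in bytes)
    host_total_kb -- host MemTotal in kB, used when the cgroup is unlimited
    """
    # An unlimited cgroup reports a huge limit, so clamp to the host total.
    total_kb = min(limit_bytes // 1024, host_total_kb)
    used_kb = min(usage_bytes // 1024, total_kb)
    cached_kb = stat.get("cache", 0) // 1024
    fields = [
        ("MemTotal", total_kb),
        ("MemFree", total_kb - used_kb),
        ("Buffers", 0),  # assumption: memcg has no equivalent, report zero
        ("Cached", cached_kb),
    ]
    # Same column layout as the kernel's /proc/meminfo lines.
    return "".join(
        "{:<16}{:>8} kB\n".format(name + ":", value) for name, value in fields
    )


if __name__ == "__main__":
    MiB = 1024 * 1024
    stat = parse_memcg_stat("cache %d\nrss %d\n" % (64 * MiB, 64 * MiB))
    print(render_meminfo(512 * MiB, 128 * MiB, stat, 8 * 1024 * 1024), end="")
```

Only the entries that top and free actually look at are produced here; anything memcg cannot provide would have to be passed through from the host's /proc/meminfo or left out.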

I do believe this is a more neutral way to go, and we have all the tools. It also does not risk breaking anything, since only people that want it would use it.
