Re: [RFC] Shared page accounting for memory cgroup

From: Balbir Singh
Date: Mon Jan 18 2010 - 20:49:49 EST

On Tue, Jan 19, 2010 at 6:52 AM, Daisuke Nishimura
<nishimura@xxxxxxxxxxxxxxxxx> wrote:
>> Correct, file cache is almost always considered shared, so it has
>> 1. non-private or shared usage of 10MB
>> 2. 10 MB of file cache
>> > I don't think "non private usage" is appropriate to this value.
>> > Why don't you just show "sum_of_each_process_rss" ? I think it would be easier
>> > to understand for users.
>> Here is my concern
>> 1. The gap between looking at memcg stat and sum of all RSS is way
>> higher in user space
>> 2. Summing up all rss without walking the tasks atomically can and
>> will lead to consistency issues. Data can be stale as long as it
>> represents a consistent snapshot of data
>> We need to differentiate between
>> 1. Data snapshot (taken at a time, but valid at that point)
>> 2. Data taken from different sources that does not form a uniform
>> snapshot, because the timestamping of each of the collected data
>> items is different
> Hmm, I'm sorry I can't understand why you need "difference".
> IOW, what can users or middlewares know by the value in the above case
> (0MB in 01 and 10MB in 02)? I've read this thread, but I can't understand
> this point... Why can this value mean some of the groups are "heavy" ?

Consider a default cgroup that is not root, and assume all applications
move there initially. With a lot of shared memory, the default cgroup
will be the first one to page in most of that memory, so its reported
usage will be very high. Without showing how much of that usage is
non-private, how does one decide whether the default cgroup is really
using a lot of memory or merely sharing it? And how do we decide on the
limits of a cgroup without knowing its actual usage, that is, the
equivalent of PSS for a region of memory for a task?
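
To make the RSS versus PSS distinction concrete, here is a rough
user-space sketch, not a proposal for the kernel interface. It sums the
Rss and Pss fields from text in /proc/<pid>/smaps format. The helper
names (sum_rss_pss, read_smaps) and the sample numbers are made up for
illustration; the sample assumes a 10 MB mapping shared evenly by two
tasks plus 512 kB of private memory.

```python
import re

# Matches the per-mapping "Rss:" and "Pss:" lines of /proc/<pid>/smaps,
# e.g. "Rss:               10240 kB". Other fields are ignored.
_FIELD = re.compile(r"^(Rss|Pss):\s+(\d+)\s+kB$")


def sum_rss_pss(smaps_text):
    """Return (rss_kb, pss_kb) summed over all mappings in smaps-format text."""
    totals = {"Rss": 0, "Pss": 0}
    for line in smaps_text.splitlines():
        m = _FIELD.match(line.strip())
        if m:
            totals[m.group(1)] += int(m.group(2))
    return totals["Rss"], totals["Pss"]


def read_smaps(pid):
    """Hypothetical helper: read one task's smaps (Linux only, needs permission)."""
    with open(f"/proc/{pid}/smaps") as f:
        return f.read()


# Illustrative sample, not real output: one 10 MB mapping shared by two
# tasks (so Pss is half of Rss) and one 512 kB private mapping.
SAMPLE = """\
Rss:               10240 kB
Pss:                5120 kB
Rss:                 512 kB
Pss:                 512 kB
"""

if __name__ == "__main__":
    rss_kb, pss_kb = sum_rss_pss(SAMPLE)
    print(f"rss={rss_kb} kB pss={pss_kb} kB")
```

Note that summing this over every task in a cgroup reads each smaps
file at a different time, so it has the same consistency problem as
summing RSS from user space.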
