Re: [rfc patch 4/6] memcg: reclaim statistics

From: Johannes Weiner
Date: Tue May 17 2011 - 03:43:00 EST

On Mon, May 16, 2011 at 05:20:31PM -0700, Ying Han wrote:
> On Mon, May 16, 2011 at 4:10 PM, Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > On Thu, May 12, 2011 at 12:33:50PM -0700, Ying Han wrote:
> > > The stats for soft_limit reclaim from global ttfp have been merged in
> > > mmotm i believe as the following:
> > >
> > > "soft_direct_steal"
> > > "soft_direct_scan"
> > >
> > > I wonder we might want to separate that out from the other case where the
> > > reclaim is from the parent triggers its limit.
> >
> > The way I implemented soft limits in 6/6 is to increase pressure on
> > exceeding children whenever hierarchical reclaim is taking place.
> >
> > This changes soft limit from
> >
> > Global memory pressure: reclaim from exceeding memcg(s) first
> >
> > to
> >
> > Memory pressure on a memcg: reclaim from all its children,
> > with increased pressure on those exceeding their soft limit
> > (where global memory pressure means root_mem_cgroup and all
> > existing memcgs are considered its children)
> >
> > which makes the soft limit much more generic and more powerful, as it
> > allows the admin to prioritize reclaim throughout the hierarchy, not
> > only for global memory pressure. Consider one memcg with two
> > subgroups. You can now prioritize reclaim to prefer one subgroup over
> > another through soft limiting.
> >
> > This is one reason why I think that the approach of maintaining a
> > global list of memcgs that exceed their soft limits is an inferior
> > approach; it does not take the hierarchy into account at all.
> >
> > This scheme would not provide a natural way of counting pages that
> > were reclaimed because of the soft limit, and thus I still oppose the
> > merging of soft limit counters.
> The proposal we discussed during LSF ( implemented in the patch " memcg:
> revisit soft_limit reclaim on contention") takes consideration
> of hierarchical reclaim. The memcg is linked in the list if it exceeds the
> soft_limit, and the soft_limit reclaim per-memcg is calling
> mem_cgroup_hierarchical_reclaim().

It does hierarchical soft limit reclaim once triggered, but I meant
that soft limits themselves have no hierarchical meaning. Say you
have the following hierarchy:


             aaa               bbb

           a1  a2            b1  b2

         a1-1


Consider aaa and a1 each having a soft limit. If global memory pressure
arose, aaa and all its children would be pushed back with the current
scheme, the one you are proposing, and the one I am proposing.

But now consider aaa hitting its hard limit. Regular target reclaim
will be triggered, and a1, a2, and a1-1 will be scanned equally from
hierarchical reclaim. That a1 is in excess of its soft limit is not
considered at all.

With what I am proposing, a1 and a1-1 would be pushed back more
aggressively than a2, because a1 is in excess of its soft limit and
a1-1 is contributing to that.

It would mean that given a group of siblings, you distribute the
pressure weighted by the soft limit configuration, independent of the
kind of hierarchical/external pressure (global memory scarcity or
parent hit the hard limit).
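
To make the sibling weighting concrete, here is a rough user-space sketch
in Python, not the kernel code. The Memcg class, scan_targets(), the boost
factor of 2, and all page counts are made up for illustration; the actual
patch may weight the pressure quite differently. The sketch also assumes
that the soft limit of the memcg at the root of the reclaim is ignored, so
that only the groups below it compete with each other.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Memcg:
    """Toy stand-in for a memory cgroup (hypothetical, not the kernel struct)."""
    name: str
    usage: int                        # pages charged directly to this group
    soft_limit: Optional[int] = None  # None means no soft limit configured
    children: List["Memcg"] = field(default_factory=list)

    def total_usage(self) -> int:
        # Hierarchical usage: own pages plus everything charged below.
        return self.usage + sum(c.total_usage() for c in self.children)

    def over_soft_limit(self) -> bool:
        return self.soft_limit is not None and self.total_usage() > self.soft_limit


def scan_targets(root: Memcg, base_scan: int, boost: int = 2) -> Dict[str, int]:
    """Pages to scan per memcg when reclaim is rooted at `root`."""
    # Assumption: the reclaim root's own soft limit does not count.
    targets: Dict[str, int] = {root.name: base_scan}

    def walk(memcg: Memcg, pressured: bool) -> None:
        # A group is pushed harder if it, or an ancestor below the
        # reclaim root, exceeds its soft limit.
        pressured = pressured or memcg.over_soft_limit()
        targets[memcg.name] = base_scan * (boost if pressured else 1)
        for child in memcg.children:
            walk(child, pressured)

    for child in root.children:
        walk(child, False)
    return targets


# The hierarchy from the example above, with invented numbers.
a1 = Memcg("a1", usage=200, soft_limit=400, children=[Memcg("a1-1", usage=300)])
a2 = Memcg("a2", usage=300)
aaa = Memcg("aaa", usage=0, soft_limit=600, children=[a1, a2])
bbb = Memcg("bbb", usage=0, children=[Memcg("b1", usage=100), Memcg("b2", usage=100)])
root = Memcg("root_mem_cgroup", usage=0, children=[aaa, bbb])

target_reclaim = scan_targets(aaa, base_scan=32)   # aaa hit its hard limit
global_reclaim = scan_targets(root, base_scan=32)  # global memory pressure
```

With these numbers, reclaim rooted at aaa scans a1 and a1-1 twice as hard
as a2. Reclaim rooted at root_mem_cgroup pushes back all of aaa's subtree,
since aaa itself is over its soft limit, while bbb, b1, and b2 get the base
pressure. The same function serves both cases; only the root differs.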

It's much easier to understand if you think of global memory pressure
to mean that root_mem_cgroup hit its hard limit, and that all existing
memcgs are hierarchically below the root_mem_cgroup. Although it is
technically not implemented that way, that would be the consistent model.

My proposal is a generic and native way of enforcing soft limits: a
memcg hit its hard limit, reclaim from the hierarchy below it, prefer
those in excess of their soft limit.

While yours is special-cased to immediate descendants of the
root_mem_cgroup.

> The current "soft_steal" and "soft_scan" is counting pages being steal/scan
> inside mem_cgroup_hierarchical_reclaim() w check_soft checking, which then
> counts pages being reclaimed because of soft_limit and also counting the
> hierarchical reclaim.

Yeah, I understand that. What I am saying is that in my code,
every time a hierarchy of memcgs is scanned (global memory reclaim,
target reclaim, kswapd or direct, it's all the same), a memcg that is
in excess of its soft limit is put under more pressure compared to its
siblings.
There is no stand-alone 'now, go reclaim soft limits' cycle anymore.
As such, it would be impossible to maintain that counter.