Re: [PATCH] memcg: expose root cgroup's memory.stat

From: Shakeel Butt
Date: Sat May 09 2020 - 10:07:02 EST


On Fri, May 8, 2020 at 2:44 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>
> On Fri, May 08, 2020 at 10:06:30AM -0700, Shakeel Butt wrote:
> > One way to measure the efficiency of memory reclaim is to look at the
> > ratio (pgscan+pfrefill)/pgsteal. However at the moment these stats are
> > not updated consistently at the system level and the ratio of these are
> > not very meaningful. The pgsteal and pgscan are updated for only global
> > reclaim while pgrefill gets updated for global as well as cgroup
> > reclaim.
> >
> > Please note that this difference is only for system level vmstats. The
> > cgroup stats returned by memory.stat are actually consistent. The
> > cgroup's pgsteal contains number of reclaimed pages for global as well
> > as cgroup reclaim. So, one way to get the system level stats is to get
> > these stats from root's memory.stat, so, expose memory.stat for the root
> > cgroup.
> >
> > from Johannes Weiner:
> > There are subtle differences between /proc/vmstat and
> > memory.stat, and cgroup-aware code that wants to watch the full
> > hierarchy currently has to know about these intricacies and
> > translate semantics back and forth.
> >
> > Generally having the fully recursive memory.stat at the root
> > level could help a broader range of usecases.
>
> The changelog begs the question why we don't just "fix" the
> system-level stats. It may be useful to include the conclusions from
> that discussion, and why there is value in keeping the stats this way.
>

Right. Andrew, can you please add the following para to the changelog?

Why not fix the stats by including both the global and cgroup reclaim
activity instead of exposing root cgroup's memory.stat? The reason is
the benefit of having metrics exposing the activity that happens
purely due to machine capacity rather than localized activity that
happens due to the limits throughout the cgroup tree. Additionally
there are userspace tools like sysstat(sar) which reads these stats to
inform about the system level reclaim activity. So, we should not
break such use-cases.

> > Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
> > Suggested-by: Johannes Weiner <hannes@xxxxxxxxxxx>
>
> Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>

Thanks a lot.