On Tue, 28 Jul 2015, Jörn Engel wrote:
> Well, we definitely need something. Having a 100GB process show 3GB of
> rss is not very useful. How would we notice a memory leak if it only
> affects hugepages, for example?

Since the hugetlb pool is a global resource, it would also be helpful to
determine if a process is mapping more than expected. You can't do that
just by adding a single hugetlb rss metric, however: if both 2MB and 1GB
hugepages are configured, you wouldn't know whether a process was mapping
512 2MB hugepages or one 1GB hugepage.
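
To make the ambiguity concrete, here is a minimal sketch of the arithmetic.
The helper name aggregate_huge_rss and the dict layout are made up for
illustration; nothing like this exists in the kernel.

```python
MB = 1 << 20
GB = 1 << 30

def aggregate_huge_rss(mappings):
    """Collapse per-size hugepage counts into one byte total.

    mappings: dict of hugepage size in bytes -> number of pages mapped.
    This is a hypothetical stand-in for a single "huge rss" counter.
    """
    return sum(size * count for size, count in mappings.items())

# Two very different mappings of the global hugetlb pool...
many_small = {2 * MB: 512}   # 512 hugepages of 2MB each
one_large = {1 * GB: 1}      # a single 1GB hugepage

# ...produce the same aggregate number, so the counter alone cannot
# tell which pool the process is consuming.
print(aggregate_huge_rss(many_small) == aggregate_huge_rss(one_large))
```

Both cases add up to 1GB, which is why the per-size breakdown has to be
kept if the number is meant to say anything about pool usage.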
That's the purpose of hugetlb_cgroup, after all, and it supports usage
counters for all hstates. The test could be converted to use that to
measure usage if configured in the kernel.
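
As a rough sketch of what that could look like from userspace, the
following reads the per-hstate usage counters from a hugetlb cgroup
directory. It assumes the cgroup v1 file naming
hugetlb.<hugepagesize>.usage_in_bytes; the function name hugetlb_usage
is hypothetical, and the directory to pass in depends on where the
controller is mounted and which group the task is in.

```python
import os
import re

# Matches e.g. "hugetlb.2MB.usage_in_bytes" but not
# "hugetlb.2MB.max_usage_in_bytes" or "hugetlb.2MB.limit_in_bytes".
USAGE_FILE = re.compile(r"^hugetlb\.(\w+)\.usage_in_bytes$")

def hugetlb_usage(cgroup_dir):
    """Return {hugepage size label: bytes in use} for one hugetlb cgroup.

    cgroup_dir is an assumption supplied by the caller, for example a
    group directory under the hugetlb controller's mount point.
    """
    usage = {}
    for name in os.listdir(cgroup_dir):
        match = USAGE_FILE.match(name)
        if match:
            with open(os.path.join(cgroup_dir, name)) as f:
                usage[match.group(1)] = int(f.read().strip())
    return usage
```

A test could call this before and after the workload and compare the
per-size values, rather than relying on rss.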
Beyond that, I'm not sure how a per-hstate rss metric would be exported to
userspace in a clean way, and the same data can already be obtained
through hugetlb_cgroup. I'm not sure how successful you'd be in
arguing that we need separate rss counters for it.