Re: [PATCH v2] kvm, mm: account shadow page tables to kmemcg

From: Michal Hocko
Date: Fri Jun 29 2018 - 10:55:21 EST


On Fri 29-06-18 16:40:23, Paolo Bonzini wrote:
> On 29/06/2018 16:30, Michal Hocko wrote:
> > I am not familiar wtih kvm to judge but if we are going to account this
> > memory we will probably want to let oom_badness know how much memory
> > to account to a specific process. Is this something that we can do?
> > We will probably need a new MM_KERNEL rss_stat stat for that purpose.
> >
> > Just to make it clear. I am not opposing to this patch but considering
> > that shadow page tables might consume a lot of memory it would be good
> > to know who is responsible for it from the OOM perspective. Something to
> > solve on top of this.
>
> The amount of memory is generally proportional to the size of the
> virtual machine memory, which is reflected directly into RSS. Because
> KVM processes are usually huge, and will probably dwarf everything else
> in the system (except firefox and chromium of course :)), the general
> order of magnitude of the oom_badness should be okay.

I think we will need MM_KERNEL longterm anyway. As I've said this is not
a must for this patch to go. But it is better to have a fair comparision
and kill larger processes if at all possible. It seems this should be
the case here.

> > I would also love to see a note how this memory is bound to the owner
> > life time in the changelog. That would make the review much more easier.
>
> --verbose for people that aren't well versed in linux mm, please...

Well, if the memory accounted to the memcg hits the hard limit and there
is no way to reclaim anything to reduce the charged memory then we have
to kill something. Hopefully the memory hog. If that one dies it would
be great it releases its charges along the way. My remark was just to
explain how that would happen for this specific type of memory. Bound to
a file, has its own tear down etc. Basically make life of reviewers
easier to understand the lifetime of charged objects without digging
deep into the specific subsystem.
--
Michal Hocko
SUSE Labs