Re: [RFC 0/4] Introduce unbalance proactive reclaim

From: Huan Yang
Date: Thu Nov 09 2023 - 08:07:44 EST


Hi,

On 2023/11/9 20:40, Michal Hocko wrote:

On Thu 09-11-23 18:50:36, Huan Yang wrote:
On 2023/11/9 18:39, Michal Hocko wrote:

On Thu 09-11-23 18:29:03, Huan Yang wrote:
Hi Michal Hocko,

Thanks for your suggestion.

On 2023/11/9 17:57, Michal Hocko wrote:

On Thu 09-11-23 11:38:56, Huan Yang wrote:
[...]
If so, is it better only to reclaim private anonymous pages explicitly?
Yes, in practice, we only proactively compress anonymous pages and do not
want to touch file pages.
If that is the case and this is mostly application centric (which you
seem to be suggesting) then why don't you use madvise(MADV_PAGEOUT)
instead?
madvise may not be applicable in this scenario (IMO).

This feature is aimed at a core goal, which is to compress the anonymous
pages of frozen applications.

How to detect that an application is frozen and determine which pages can be
safely reclaimed is the responsibility of the policy part.

Setting madvise for an application is an active behavior, while the above
policy is a passive approach. (If I have misunderstood, please let me know
if there is a better way to set madvise.)
You are proposing an extension to the pro-active reclaim interface so
this is an active behavior pretty much by definition. So I am really not
following you here. Your agent can simply scan the address space of the
application it is going to "freeze" and call pidfd_madvise(MADV_PAGEOUT)
on the private memory if that is really what you want/need.
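
As an illustration of the "scan the address space" step suggested above, here is a minimal sketch that picks candidate ranges out of the text of /proc/<pid>/maps. The function name, the sample data, and the heuristic for "private anonymous" (private flag, inode 0, no path or [heap] or an [anon:...] name) are assumptions made for this example, not something taken from the thread. The syscall Linux provides for advising another process through a pidfd is process_madvise(2); the sketch stops at range selection and does not issue it.

```python
from typing import List, Tuple


def private_anon_ranges(maps_text: str) -> List[Tuple[int, int]]:
    """Return (start, length) for mappings that look private and anonymous.

    Heuristic (an assumption of this sketch): the mapping is private
    (4th permission character is 'p'), has inode 0, and either has no
    pathname or is named [heap] or [anon:...].
    """
    ranges = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        addr, perms, _offset, _dev, inode = fields[:5]
        path = fields[5].strip() if len(fields) > 5 else ""
        is_private = len(perms) == 4 and perms[3] == "p"
        is_anon = inode == "0" and (
            path in ("", "[heap]") or path.startswith("[anon:")
        )
        if is_private and is_anon:
            start_s, end_s = addr.split("-")
            start, end = int(start_s, 16), int(end_s, 16)
            ranges.append((start, end - start))
    return ranges


# Hypothetical maps content, used only to exercise the parser.
SAMPLE = """\
00400000-0040b000 r-xp 00000000 08:01 1234 /usr/bin/app
01a00000-01a21000 rw-p 00000000 00:00 0 [heap]
7f00aa000000-7f00aa021000 rw-p 00000000 00:00 0
7f00ab000000-7f00ab001000 rw-s 00000000 00:05 77 /dev/shm/x
7ffc00000000-7ffc00021000 rw-p 00000000 00:00 0 [stack]
"""

if __name__ == "__main__":
    for start, length in private_anon_ranges(SAMPLE):
        print(hex(start), hex(length))
```

An agent following this route would pass each (start, length) pair to process_madvise(2) with MADV_PAGEOUT, once per process of the application.
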
There is a key point here. We want to use the grouping policy of memcg
to perform proactive reclamation with certain tendencies. Your
suggestion is to reclaim memory by scanning the task process space.
However, in the mobile field, memory is usually viewed at the
granularity of an APP.
OK, this is likely a terminology gap on my end. By application you do
not really mean a process but rather a whole cgroup. That would have
been really useful to be explicit about.
I'm sorry for the confusion. But, in reality, the example I gave was just the one we use
here. In terms of policy, any reasonable method can be chosen to organize cgroups
and reclaim memory with certain tendencies.
But, let's continue the discussion assuming that memcg is grouped by application to
avoid confusion.

Therefore, after an APP is frozen, we hope to reclaim memory uniformly
according to the pre-grouped APP processes.

Of course, as you suggested, madvise can also achieve this, but
implementing it in the agent may be more complex. (In terms of
achieving the same goal, isn't using memcg to group all the processes of an
APP and perform proactive reclamation simpler than using madvise
and having an agent scan multiple processes of an application?)
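
To make the comparison concrete: with the madvise route, the agent first has to enumerate every process in the application's memcg before it can scan any address space. A minimal sketch of that enumeration step follows, assuming cgroup v2 and its cgroup.procs file; the function name and the directory layout are illustrative only.

```python
from pathlib import Path
from typing import List


def cgroup_pids(cgroup_dir: str) -> List[int]:
    """Return the process IDs listed in <cgroup_dir>/cgroup.procs.

    cgroup v2 exposes one PID per line in this file; the order is not
    guaranteed to be sorted, so callers should not rely on it.
    """
    text = (Path(cgroup_dir) / "cgroup.procs").read_text()
    return [int(token) for token in text.split()]
```

Each returned PID would then need its own maps scan and its own madvise calls, whereas a memcg-level interface takes a single write per application.
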
It might be more involved but the primary question is whether it is
usable for the specific use case. Madvise interface is not LRU aware but
you are not really talking about that to be a requirement? So it would
really help if you go deeper into details on how is the interface
actually supposed to be used in your case.
In the mobile field, we usually configure zram to compress anonymous pages.
With zram, we can effectively expand usable memory on devices with limited
hardware memory.

With proper strategies, an 8GB RAM phone can approximate the usage of a 12GB phone
(or more).

In our strategy, we group memcg by application. When the agent detects that an
application has entered the background, has been frozen, and has not been used for a long time,
the agent will slowly issue commands to reclaim the anonymous pages of that application.

With this interface, the agent issues: `echo memory anon > memory.reclaim`
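
A minimal sketch of how an agent might "slowly issue commands" with such an interface is shown below. It rests on stated assumptions: the `anon` key exists only in this RFC, and the request format `<bytes> anon` assumes that `memory` in the command above stands for the amount to reclaim. The function names and the chunking policy are hypothetical.

```python
import os
from typing import Iterator


def anon_reclaim_requests(total_bytes: int, chunk_bytes: int) -> Iterator[str]:
    """Split a reclaim target into small requests so reclaim proceeds gradually.

    The "<bytes> anon" string format is an assumption based on the RFC's
    example command, not a mainline kernel interface.
    """
    if total_bytes < 0 or chunk_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and chunk_bytes > 0")
    remaining = total_bytes
    while remaining > 0:
        step = min(chunk_bytes, remaining)
        yield f"{step} anon"
        remaining -= step


def write_request(cgroup_dir: str, request: str) -> None:
    """Write one request string to <cgroup_dir>/memory.reclaim."""
    with open(os.path.join(cgroup_dir, "memory.reclaim"), "w") as f:
        f.write(request)
```

An agent would call `write_request` once per yielded string, pausing between writes. On a real kernel a write to memory.reclaim can fail with EAGAIN when less than the requested amount was reclaimed, so the caller should be prepared to catch OSError.
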


Also make sure to explain why you cannot use other existing interfaces.
For example, why don't you simply decrease the limit of the frozen
cgroup and rely on the normal reclaim process to evict the coldest
memory?
This is a question of reclamation tendency, and simply decreasing the limit of the frozen
cgroup cannot achieve this.
What are you basing your anon vs. file proportion decision on?
When zram is configured and anonymous pages are reclaimed proactively, the refault
probability of anonymous pages is low when an application is frozen and not reopened.
Also, the cost of refaulting from zram is relatively low.

However, file pages usually have shared properties, so even if an application is frozen,
other processes may still access the file pages. If a limit is set and the reclamation encounters
file pages, it will cause a certain amount of refault I/O, which is costly for mobile devices.

Therefore, we want to have a proactive reclamation interface that has a tendency to only
reclaim anonymous pages rather than file pages.

By doing so, more application data can be kept in the background, and when the application
is reopened from the background, a cold start can be avoided. (A cold start means that the application
needs to reload the required data and reinitialize its running logic.)

In other words, more details, ideally with some numbers, and make sure to
describe why existing APIs cannot be used.
--
Michal Hocko
SUSE Labs

--
Thanks,
Huan Yang