Re: [RFC 0/4] Introduce unbalance proactive reclaim

From: Huan Yang
Date: Mon Nov 13 2023 - 03:26:17 EST



On 2023/11/13 16:05, Huang, Ying wrote:
Huan Yang <link@xxxxxxxx> writes:

On 2023/11/13 14:10, Huang, Ying wrote:
Huan Yang <link@xxxxxxxx> writes:

On 2023/11/10 20:24, Michal Hocko wrote:
On Fri 10-11-23 11:48:49, Huan Yang wrote:
[...]
Also, when the application enters the foreground, its startup may be
slower, and traces show a lot of block I/O (usually 1000+ I/Os and
200+ ms of I/O time). We usually observe very little block I/O caused
by zram refaults (read: 1698.39MB/s, write: 995.109MB/s), which are
usually faster than random disk reads (read: 48.1907MB/s, write:
49.1654MB/s). These numbers come from zram-perf, which I modified
slightly to test UFS.
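
To put the figures quoted above side by side, here is a minimal sketch
that only divides the reported throughputs. The constant and function
names are made up for illustration, and the numbers are copied from the
message, not re-measured.

```python
# Throughput figures quoted in the message above, in MB/s.
ZRAM_READ, ZRAM_WRITE = 1698.39, 995.109
UFS_READ, UFS_WRITE = 48.1907, 49.1654


def speedup(fast: float, slow: float) -> float:
    """Return how many times faster `fast` is than `slow`."""
    return fast / slow


read_ratio = speedup(ZRAM_READ, UFS_READ)
write_ratio = speedup(ZRAM_WRITE, UFS_WRITE)
print(f"read: zram ~{read_ratio:.1f}x UFS, write: zram ~{write_ratio:.1f}x UFS")
```

With these numbers, zram comes out roughly 35x faster for reads and
roughly 20x faster for writes.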

Therefore, if the proactive reclamation encounters many file pages,
the application may become slow when it is opened.
OK, this is interesting information. From the above it seems that
storage based IO refaults are an order of magnitude more expensive than
swap (zram in this case). That means that the memory reclaim should
_in general_ prefer anonymous memory reclaim over refaulted page cache,
right? Or is there any reason why "frozen" applications are any
different in this case?
Frozen applications mean that the application process is no longer active,
so once its private anonymous page data is swapped out, the anonymous
pages will not be refaulted until the application becomes active again.

On the contrary, page caches are usually shared. Even if the application
that first read the file is no longer active, other processes may still
read the file. Therefore, it is not reasonable to use the proactive
reclamation interface to reclaim page caches without considering memory
pressure.
No. Not all page caches are shared. For example, the page caches used
for use-once streaming IO are not, and they should be reclaimed first.
Yes, but this part is done very well in MGLRU and does not require our
intervention.
Moreover, clean file pages are reclaimed very quickly, whereas anonymous
pages are reclaimed somewhat more slowly.
So, your solution may work well for your specific use cases, but it's
not a general solution.
Yes, this approach is not universal.
Per my understanding, you want to reclaim only
private pages to avoid impacting the performance of other applications.
Privately mapped anonymous pages are easy to identify (and I suggest
that you find a way to avoid reclaiming shared mapped anonymous pages).
Yes, it is not good to reclaim shared anonymous pages, and they need to be
identified. In the future, we will consider how to filter them out.
Thanks.
There are some heuristics in the reclaim code to identify use-once page
caches. Why don't they work for your situation?
As mentioned above, the default reclaim algorithm already handles file
pages well, so we do not need to intervene in it.
Reclaiming these use-once file pages through direct reclaim or kswapd is
very fast and does not cause lag or other side effects.
Our overall goal is to actively and reasonably compress unused anonymous
pages based on certain strategies, in order to increase available memory to
a certain extent, avoid lag, and prevent applications from being killed.
Therefore, using the proactive reclaim interface, combined with the LRU
algorithm and reclaim tendencies, is a good way to achieve our goal.
If so, why can't you just use the proactive reclaim with some large
enough swappiness? That will reclaim use-once page caches and compress
This works very well for proactive memory reclaim that is only executed once.
However, we need to perform proactive reclaim in batches. Suppose that only
5% of the pages in this memcg are reclaimable use-once page cache, but we
call proactive memory reclaim step by step, such as 5%, 10%, 15% ... 100%.
Then, even after that 5% of use-once pages has been reclaimed, later steps
may still reclaim page cache because of the balancing adjustment in reclaim,
so we may still touch shared file pages.
(If I misunderstood anything, please correct me.)
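
To make the step-by-step pattern concrete, here is a minimal sketch
that assumes cgroup v2 and its memory.reclaim file. The function names,
the 5% default and the memcg path in the usage note are hypothetical
and are not taken from the patch series.

```python
from pathlib import Path


def plan_batches(total_bytes: int, step_percent: int = 5) -> list[int]:
    """Split total_bytes into per-step sizes so that the cumulative
    targets are step_percent, 2 * step_percent, ... up to 100%."""
    if not 0 < step_percent <= 100:
        raise ValueError("step_percent must be in (0, 100]")
    sizes = []
    done = 0
    for pct in range(step_percent, 101, step_percent):
        target = total_bytes * pct // 100
        sizes.append(target - done)
        done = target
    if done < total_bytes:
        # step_percent does not divide 100 evenly: add the remainder.
        sizes.append(total_bytes - done)
    return sizes


def request_reclaim(memcg_dir: str, nbytes: int) -> bool:
    """Ask the kernel to reclaim nbytes from one memcg (cgroup v2).
    Returns False if the file is missing or the write fails, for
    example when fewer bytes than requested could be reclaimed."""
    try:
        Path(memcg_dir, "memory.reclaim").write_text(str(nbytes))
        return True
    except OSError:
        return False
```

A caller could loop over plan_batches(usage) and pass each size to
request_reclaim("/sys/fs/cgroup/<memcg>", size). Each such request is
still balanced between anon and file pages by the kernel, which is the
behaviour described above.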

We previously tried switching swappiness between 200 and 0 to adjust the reclaim
tendency. However, the debug interface showed that some file pages were still
reclaimed, and after proactive reclaim, some running applications, as well as
reopened applications that had been reclaimed, showed block I/O and startup lag.
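
The debug interface we used is not shown here. As a stand-in, a minimal
sketch of the same kind of check could compare the "anon" and "file"
counters of the cgroup v2 memory.stat file before and after a reclaim
request. The function names are hypothetical.

```python
def parse_memory_stat(text: str) -> dict[str, int]:
    """Parse the 'key value' lines of a memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def reclaimed_by_type(before: str, after: str) -> dict[str, int]:
    """Return how many bytes of anon and file memory disappeared
    between two memory.stat snapshots (negative means growth)."""
    b = parse_memory_stat(before)
    a = parse_memory_stat(after)
    return {key: b.get(key, 0) - a.get(key, 0) for key in ("anon", "file")}
```

A clearly positive "file" delta after a request that was meant to hit
only anonymous pages would suggest that file pages were reclaimed too,
although other activity in the memcg can also change these counters.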

This kind of incomplete control over the process may not be suitable for proactive
memory reclaim. Instead, with a proactive reclaim interface that takes a tendency,
we can issue a 5% page cache trim once and then gradually reclaim anonymous pages.
anonymous pages. So, more applications can be kept in memory before
passive reclaiming or killing background applications?

--
Best Regards,
Huang, Ying

--
Thanks,
Huan Yang