Re: [patch v2] mm, vmscan: avoid thrashing anon lru when free + file is low

From: Michal Hocko
Date: Wed May 03 2017 - 02:15:43 EST


On Tue 02-05-17 13:41:23, David Rientjes wrote:
> On Tue, 2 May 2017, Michal Hocko wrote:
>
> > I have already asked and my questions were ignored. So let me ask again
> > and hopefully not get ignored this time. So why do we need a different
> > criterion on anon pages than file pages?
>
> The preference in get_scan_count() as already implemented is to reclaim
> from file pages if there is enough memory on the inactive list to reclaim.
> That is unchanged with this patch.

My fault, I was too vague. My question was basically why we should use
a different criterion for SCAN_ANON than for SCAN_FILE.

> > I do agree that blindly
> > scanning anon pages when file pages are low is very suboptimal but this
> > adds yet another heuristic without _any_ numbers. Why cannot we simply
> > treat anon and file pages equally? Something like the following
> >
> > 	if (pgdatfile + pgdatanon + pgdatfree > 2*total_high_wmark) {
> > 		scan_balance = SCAN_FILE;
> > 		if (pgdatfile < pgdatanon)
> > 			scan_balance = SCAN_ANON;
> > 		goto out;
> > 	}
> >
>
> This would be substantially worse than the current code because it
> thrashes the anon lru when anon outnumbers file pages rather than at the
> point we fall under the high watermarks for all eligible zones. If you
> tested your suggestion, you could see gigabytes of memory left untouched
> on the file lru. Anonymous memory is more likely to be part of the
> working set.

This was supposed to be more of an example of the direction I was
thinking in, definitely not a final patch. I will think about it more
and come up with a more complete proposal.
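
To make the direction quoted above easier to experiment with, here is a
minimal standalone sketch of that condition as a userspace C function. It is
an illustration only, not kernel code: pick_scan_balance() is a hypothetical
helper, the enum is a stand-in for the scan_balance values used in
mm/vmscan.c, and the page counts are assumed to be passed in by the caller
rather than read from the node.

```c
#include <stdbool.h>

/* Stand-in for the scan_balance values; assumed, not taken from the patch. */
enum scan_balance { SCAN_EQUAL, SCAN_FRACT, SCAN_ANON, SCAN_FILE };

/*
 * Hypothetical helper mirroring the quoted snippet.
 *
 * If file + anon + free pages exceed twice the summed high watermarks,
 * pick the larger of the two LRUs: SCAN_FILE by default, SCAN_ANON when
 * there are fewer file pages than anon pages. Returns true when the
 * shortcut applies (the "goto out" case) and false otherwise, in which
 * case *balance is left untouched.
 */
static bool pick_scan_balance(unsigned long pgdatfile,
			      unsigned long pgdatanon,
			      unsigned long pgdatfree,
			      unsigned long total_high_wmark,
			      enum scan_balance *balance)
{
	if (pgdatfile + pgdatanon + pgdatfree > 2 * total_high_wmark) {
		*balance = (pgdatfile < pgdatanon) ? SCAN_ANON : SCAN_FILE;
		return true;
	}
	return false;
}
```

With made-up numbers such as pgdatfile = 1000, pgdatanon = 5000,
pgdatfree = 100 and total_high_wmark = 500, the helper selects SCAN_ANON
even though 1000 file pages are still available. That is the case the
objection quoted above is about.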

> > Also it would help to describe the workload which can trigger this
> > behavior so that we can compare numbers before and after this patch.
>
> Any workload that fills system RAM with anonymous memory that cannot be
> reclaimed will thrash the anon lru without this patch.

I have already asked, but I do not understand why this anon memory
couldn't be reclaimed. Who is pinning it? Why cannot it be swapped out?
If it is mlocked it should be moved to the unevictable LRU. What am I
missing?

--
Michal Hocko
SUSE Labs