Re: [patch] mm, vmscan: avoid thrashing anon lru when free + file is low

From: Minchan Kim
Date: Tue Apr 18 2017 - 20:15:22 EST


Hi David,

On Tue, Apr 18, 2017 at 02:32:56PM -0700, David Rientjes wrote:
> On Tue, 18 Apr 2017, Minchan Kim wrote:
>
> > > The purpose of the code that commit 623762517e23 ("revert 'mm: vmscan: do
> > > not swap anon pages just because free+file is low'") reintroduces is to
> > > prefer swapping anonymous memory rather than trashing the file lru.
> > >
> > > If all anonymous memory is unevictable, however, this insistance on
> >
> > "unevictable" means hot workingset, not (mlocked and increased refcount
> > by some driver)?
> > I got confused.
> >
>
> For my purposes, it's mlocked, but I think this thrashing is possible
> anytime we fail the file lru heuristic and the evictable anon lrus are
> very small themselves. I'll update the changelog to make this explicit.

I understood now. Thanks for clarifying.

>
> > > Check that enough evictable anon memory is actually on this lruvec before
> > > insisting on SCAN_ANON. SWAP_CLUSTER_MAX is used as the threshold to
> > > determine if only scanning anon is beneficial.
> >
> > Why do you use SWAP_CLUSTER_MAX instead of (high wmark + free) like
> > file-backed pages?
> > As considering anonymous pages have more probability to become workingset
> > because they are are mapped, IMO, more {strong or equal} condition than
> > file-LRU would be better to prevent anon LRU thrashing.
> >
>
> If the suggestion is checking
> NR_ACTIVE_ANON + NR_INACTIVE_ANON > total_high_wmark pages, it would be a
> separate heurstic to address a problem that I'm not having :) My issue is
> specifically when NR_ACTIVE_FILE + NR_INACTIVE_FILE < total_high_wmark,
> NR_ACTIVE_ANON + NR_INACTIVE_ANON is very large, but all not on this
> lruvec's evictable lrus.

I understand it as "all not eligible LRU lists". Right?
I will write the comment below with that my assumption is right.

>
> This is the reason why I chose lruvec_lru_size() rather than per-node
> statistics. The argument could also be made for the file lrus in the
> get_scan_count() heuristic that forces SCAN_ANON, but I have not met such
> an issue (yet). I could follow-up with that change or incorporate it into
> a v2 of this patch if you'd prefer.

I don't think we need to fix that part because the logic is to keep
some amount of file-backed page workingset regardless of eligible
zones.

>
> In other words, I want get_scan_count() to not force SCAN_ANON and
> fallback to SCAN_FRACT, absent other heuristics, if the amount of
> evictable anon is below a certain threshold for this lruvec. I
> arbitrarily chose SWAP_CLUSTER_MAX to be conservative, but I could easily
> compare to total_high_wmark as well, although I would consider that more
> aggressive.

I realize your problem now. It's rather different heuristic so no need
to align file-lru. But SWAP_CLUSTER_MAX is too conservatie, too. IMHO.

How about this?

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 24efcc20af91..5d2f3fa41e92 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2174,8 +2174,17 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
}

if (unlikely(pgdatfile + pgdatfree <= total_high_wmark)) {
- scan_balance = SCAN_ANON;
- goto out;
+ /*
+ * force SCAN_ANON if inactive anonymous LRU lists of
+ * eligible zones are enough pages. Otherwise, thrashing
+ * can be happen on the small anonymous LRU list.
+ */
+ if (!inactive_list_is_low(lruvec, false, NULL, sc, false) &&
+ lruvec_lru_size(lruvec, LRU_INACTIVE_ANON, sc->reclaim_idx)
+ >> sc->priority) {
+ scan_balance = SCAN_ANON;
+ goto out;
+ }
}
}


Thanks.