Re: [PATCH v3] vm_swappiness=0 should still try to avoid swapping anon memory

From: Johannes Weiner
Date: Tue Aug 10 2021 - 11:27:21 EST


Hello Nico,

On Mon, Aug 09, 2021 at 06:37:40PM -0400, Nico Pache wrote:
> Since commit 170b04b7ae49 ("mm/workingset: prepare the workingset detection
> infrastructure for anon LRU") and commit b91ac374346b ("mm: vmscan: enforce
> inactive:active ratio at the reclaim root") swappiness can start prematurely

Could clarify what you mean by "prematurely"?

The new balancing algorithm targets the lowest amount of overall
paging IO performed across the anon and file sets. It doesn't swap
unless it has an indication that a couple of swap writes are
outweighed by a reduction of reads on the cache side.

Is this not working for you?

> swapping anon memory. This is due to the assumption that refaulting anon should
> always allow the shrinker to target anon memory.

This doesn't sound right. Did you mean "refaulting file"?

> Add a check for swappiness being >0 before indiscriminately
> targeting Anon.

> Before these commits when a user had swappiness=0 anon memory would
> rarely get swapped; this behavior has remained constant sense
> RHEL5. This commit keeps that behavior intact and prevents the new
> workingset refaulting from challenging the anon memory when
> swappiness=0.

I'm wondering how you're getting anon scans with swappiness=0. If you
look at get_scan_count(), SCAN_FRACT with swappines=0 should always
result in ap = fraction[0] = 0, which never yields any anon scan
targets. So I'm thinking you're running into sc->file_is_tiny
situations, meaning remaining file pages alone are not enough to
restore watermarks anymore. Is that possible?

In that case, anon scanning is forced, and always has been. But the
difference is that before the above-mentioned patches, we'd usually
force scan just the smaller inactive list, whereas now we disable
active list protection due to swapins and target the entire anon
set. I suppose you'd prefer we go back to that, so that more pressure
remains proportionally on the file set, and just enough anon to get
above the watermarks again.

One complication I could see with that is that we no longer start anon
pages on the active list like we used to. We used to say active until
proven otherwise; now it's inactive until proven otherwise. It's
possible for the inactive list to contain a much bigger share of the
total anon set now than before, in which case your patch wouldn't have
the desired effect of targetting just a small amount of anon pages to
get over the watermark hump.

We may need a get_scan_count() solution after all, and I agree with
previous reviews that this is the better location for such an issue...

One thing I think we should do - whether we need more on top or not -
is allowing file reclaim to continue when sc->file_is_tiny. Yes, we
also need anon to meet the watermarks, but it's not clear why we
should stop scanning file pages altogether: it's possible they get us
there 99% of the way, and somebody clearly wanted us to swap as little
as possible to end up in a situation like that, so:

diff --git a/mm/vmscan.c b/mm/vmscan.c
index eeab6611993c..90dac3dc9903 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2477,7 +2477,7 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
* If the system is almost out of file pages, force-scan anon.
*/
if (sc->file_is_tiny) {
- scan_balance = SCAN_ANON;
+ scan_balance = SCAN_EQUAL;
goto out;
}