[RFC PATCH 1/1] mm: only use old generation and stable tier for madv_pageout

From: zhaoyang.huang
Date: Fri Oct 13 2023 - 07:30:52 EST


From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>

Dropping pages of the young generation or of an unstable tier via madvise
can cause heavy page thrashing and IO pressure. Furthermore, it can defeat
the tier's PID controller, which affects normal reclaim. I would like to
suggest skipping these pages in madv_pageout.

Signed-off-by: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
---
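A user-space sketch of the skip condition, for anyone who wants to poke at
it without building a kernel. The model_* names are hypothetical and are not
kernel symbols. MAX_NR_GENS = 4 and the tier mapping order_base_2(refs + 1)
are assumptions based on the MGLRU helpers around the time of this patch and
should be checked against the tree. tier_st stands for whatever
get_tier_idx() returns.

```c
#include <stdbool.h>

/* Assumption: number of generations, as in the kernel around this patch. */
#define MODEL_MAX_NR_GENS 4UL

/* Model of lru_gen_from_seq(): the generation index is seq modulo the
 * number of generations. */
static int model_gen_from_seq(unsigned long seq)
{
	return (int)(seq % MODEL_MAX_NR_GENS);
}

/* Model of lru_tier_from_refs(), assumed to be order_base_2(refs + 1),
 * i.e. the smallest tier such that (1 << tier) >= refs + 1. */
static int model_tier_from_refs(int refs)
{
	int tier = 0;

	while ((1 << tier) < refs + 1)
		tier++;
	return tier;
}

/* Mirrors the condition added to madvise_cold_or_pageout_pte_range():
 * skip the folio if its generation index is more than one above the
 * oldest generation index, or if its tier is above tier_st. */
static bool model_skip_pageout(int gen, unsigned long min_seq,
			       int refs, int tier_st)
{
	int oldest = model_gen_from_seq(min_seq);
	int tier = model_tier_from_refs(refs);

	return gen > oldest + 1 || tier > tier_st;
}
```

With min_seq = 4 the oldest index is 0, so model_skip_pageout(2, 4, 0, 3)
is true and model_skip_pageout(1, 4, 0, 3) is false. The comparison works
on the wrapped index: with min_seq = 7 the oldest index is 3 and no index in
0..3 exceeds 4, so in this model the generation check never triggers there.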
 include/linux/swap.h |  1 +
 mm/madvise.c         | 12 ++++++++++++
 mm/vmscan.c          |  3 ++-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 493487ed7c38..d09c859ccc45 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -496,6 +496,7 @@ extern int init_swap_address_space(unsigned int type, unsigned long nr_pages);
 extern void exit_swap_address_space(unsigned int type);
 extern struct swap_info_struct *get_swap_device(swp_entry_t entry);
 sector_t swap_page_sector(struct page *page);
+extern int get_tier_idx(struct lruvec *lruvec, int type);

 static inline void put_swap_device(struct swap_info_struct *si)
 {
diff --git a/mm/madvise.c b/mm/madvise.c
index 4dded5d27e7e..324d76096ca5 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -452,6 +452,18 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		if (!folio || folio_is_zone_device(folio))
 			continue;

+		if (lru_gen_enabled() && pageout) {
+			int gen = folio_lru_gen(folio);
+			struct lruvec *lruvec = folio_lruvec(folio);
+			int type = folio_is_file_lru(folio);
+			int refs = folio_lru_refs(folio);
+			int tier = lru_tier_from_refs(refs);
+			int tier_st = get_tier_idx(lruvec, type);
+
+			if (gen > lru_gen_from_seq(lruvec->lrugen.min_seq[type]) + 1
+				|| tier > tier_st)
+				continue;
+		}
 		/*
 		 * Creating a THP page is expensive so split it only if we
 		 * are sure it's worth. Split it if we are only owner.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6f13394b112e..16900a8c13e0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5072,7 +5072,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
 	return isolated || !remaining ? scanned : 0;
 }

-static int get_tier_idx(struct lruvec *lruvec, int type)
+int get_tier_idx(struct lruvec *lruvec, int type)
 {
 	int tier;
 	struct ctrl_pos sp, pv;
@@ -5091,6 +5091,7 @@ static int get_tier_idx(struct lruvec *lruvec, int type)

 	return tier - 1;
 }
+EXPORT_SYMBOL_GPL(get_tier_idx);

 static int get_type_to_scan(struct lruvec *lruvec, int swappiness, int *tier_idx)
 {
--
2.25.1