Re: [PATCH v4 1/2] mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()

From: Waiman Long
Date: Mon Apr 07 2025 - 10:50:56 EST


On 4/7/25 10:24 AM, Johannes Weiner wrote:
On Sun, Apr 06, 2025 at 09:41:58PM -0400, Waiman Long wrote:
The test_memcontrol selftest consistently fails its test_memcg_low
sub-test due to the fact that two of its test child cgroups which
have a memory.low of 0 or an effective memory.low of 0 still have low
events generated for them since mem_cgroup_below_low() uses the ">="
operator when comparing to elow.
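
A minimal C sketch of that comparison, for illustration only. The struct
and function names below are hypothetical stand-ins, not the kernel's
definitions; the only thing taken from the text above is that the check
compares the effective low value against usage with ">=".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two values the check reads (in pages). */
struct counter_model {
	unsigned long elow;	/* effective memory.low */
	unsigned long usage;	/* current charge */
};

/* Models the ">=" comparison: true means "treated as below low". */
static bool model_below_low(const struct counter_model *c)
{
	assert(c != NULL);
	return c->elow >= c->usage;
}
```

With elow and usage both 0, model_below_low() returns true, which is how
an empty cgroup ends up treated as protected. With an elow of 0 and any
non-zero usage it returns false.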

The two failed use cases are as follows:

1) memory.low is set to 0, but low events can still be triggered and
so the cgroup may have a non-zero low event count. I doubt users are
looking for that as they didn't set memory.low at all.

2) memory.low is set to a non-zero value but the cgroup has no task in
it so that it has an effective low value of 0. Again it may have a
non-zero low event count if memory reclaim happens. This is probably
not a result expected by the users and it is really doubtful that
users will check an empty cgroup with no task in it and expect
some non-zero event counts.

In the first case, even though memory.low isn't set, it may still have
some low protection if memory.low is set in the parent. So low events may
still be recorded. The test_memcontrol.c test has to be modified to
account for that.

For the second case, it really doesn't make sense to have a non-zero
low event count if the cgroup has 0 usage. So we need to skip this corner
case in shrink_node_memcgs() by skipping the !usage case. The
"#ifdef CONFIG_MEMCG" directive is added to avoid problems with the
non-CONFIG_MEMCG case.
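
The effect of the !usage skip can be sketched with a small standalone
model. Everything here is hypothetical (struct memcg_model, walk_memcgs(),
the skip_unused flag); it is not the kernel loop, and it counts a low
event on every hit of the ">=" check, which simplifies the real control
flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cgroup state, all values in pages. */
struct memcg_model {
	unsigned long elow;		/* effective memory.low */
	unsigned long usage;		/* current charge */
	unsigned long low_events;	/* modelled low event count */
};

/*
 * Walk the groups once. Returns how many groups were handed to reclaim.
 * skip_unused models the proposed "skip memcg with no usage" check.
 */
static size_t walk_memcgs(struct memcg_model *cgs, size_t n, bool skip_unused)
{
	size_t reclaimed_from = 0;

	assert(cgs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct memcg_model *cg = &cgs[i];

		/* Nothing charged, so nothing to protect or reclaim. */
		if (skip_unused && cg->usage == 0)
			continue;

		/* Same ">=" comparison as the low check described above. */
		if (cg->elow >= cg->usage) {
			cg->low_events++;
			continue;
		}

		reclaimed_from++;
	}
	return reclaimed_from;
}
```

Without the skip, a group with elow == 0 and usage == 0 picks up a low
event on each walk; with the skip it is passed over before the
comparison is made.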

With this patch applied, the test_memcg_low sub-test finishes
successfully in most cases, though both the test_memcg_low
and test_memcg_min sub-tests may still fail occasionally if the
memory.current values fall outside of the expected ranges.

Suggested-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
---
mm/vmscan.c | 10 ++++++++++
tools/testing/selftests/cgroup/test_memcontrol.c | 7 ++++++-
2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index b620d74b0f66..65dee0ad6627 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5926,6 +5926,7 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
 	return inactive_lru_pages > pages_for_compaction;
 }
+#ifdef CONFIG_MEMCG
 static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
 {
 	struct mem_cgroup *target_memcg = sc->target_mem_cgroup;
@@ -5963,6 +5964,10 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
 		mem_cgroup_calculate_protection(target_memcg, memcg);
+		/* Skip memcg with no usage */
+		if (!page_counter_read(&memcg->memory))
+			continue;

Please use mem_cgroup_usage() like I had originally suggested.

The !CONFIG_MEMCG case can be done like its root cgroup branch.

Will do that.

 		if (mem_cgroup_below_min(target_memcg, memcg)) {
 			/*
 			 * Hard protection.
@@ -6004,6 +6009,11 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
 		}
 	} while ((memcg = mem_cgroup_iter(target_memcg, memcg, partial)));
 }
+#else
+static inline void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
+{
+}
+#endif /* CONFIG_MEMCG */

You made the entire reclaim path a nop for !CONFIG_MEMCG.

Yes, that is probably not right. Will fix that.

Cheers,
Longman